Abstract


The weak call-by-value λ-calculus L and Turing machines can simulate each other with a polynomial
overhead in time. This time invariance thesis for L, where the number of β-reductions of a computation
is taken as its time complexity, is the culmination of a 25-year line of research, combining work by
Blelloch, Greiner, Dal Lago, Martini, Accattoli, Forster, Kunze, Roth, and Smolka. The present
paper gives a mechanised proof of the time invariance thesis for L, constituting the first mechanised
equivalence proof between two standard models of computation covering time complexity.

The mechanisation builds on an existing framework for the extraction of Coq functions to L and 
contributes a novel Hoare logic framework for the verification of Turing machines. 

The mechanised proof of the time invariance thesis establishes L as a model for future developments
of mechanised computational complexity theory regarding time. It can also be seen as a non-trivial 
but elementary case study of time-complexity-preserving translations between a functional language 
and a sequential machine model. As a by-product, we obtain a mechanised many-one equivalence 
proof of the halting problems for L and Turing machines, which we contribute to the Coq Library of 
Undecidability Proofs. 
1 Introduction


Computability theory — i.e. the study of which problems can be solved computationally — 
is invariant under the chosen model of computation: Any Turing-complete model does the 
job. Similarly, but less generally, computational complexity theory — i.e. the study of how 
efficiently problems can be solved computationally — is invariant under the chosen model: 
Both Turing machines and RAM machines are used and can be exchanged as long as only 
complexity classes closed under polynomial time reductions are of interest. This practice is 
based on the invariance thesis, introduced by Slot and van Emde Boas as “all reasonable 
models of computation simulate each other with a polynomially bounded overhead in time 
and a constant factor overhead in space” [23]. For the present paper, we dub the first half 
concerning time complexity the time invariance thesis. 
Mechanised Time Invariance Thesis for L


While RAM machines and Turing machines are reasonable in this sense, the question of whether
and how the (untyped) λ-calculus [4] is reasonable was long open.
This may sound especially surprising given the fact that the λ-calculus was one of the first
models for computability to be proven Turing-complete, by Turing himself [27]. On the other
hand, the λ-calculus is indeed more complicated than sequential models: Terms are trees
and contain binders, computation is non-local, multiple potentially non-equivalent reduction
strategies can be chosen, etc. Maybe most crucially, it is not obvious what a reasonable
time complexity measure is: The number of β-steps in a computation? Or does one have to
account for (the size of) β-redexes? And even more unclear: What is the space complexity
of a λ-calculus computation?

More or less independently of these questions, one direction of the time invariance thesis 
is easy to prove: The λ-calculus can simulate Turing machines.

In this paper, we focus on the weak call-by-value λ-calculus introduced by Plotkin [21],
in the concrete variant L introduced by Forster and Smolka [12]. In L, only abstractions are
values, and we take the number of β-steps as time complexity measure of a computation [7].
L is similar to an ML-like functional programming language: Reductions in abstractions are
not allowed, and a β-reduction only applies when both sides of an application are values.

A short timeline of time complexity for L and the full λ-calculus looks as follows:


1995 Blelloch and Greiner [3] prove the time invariance thesis for the weak call-by-value
λ-calculus w.r.t. RAM machines, with the number of β-steps as time complexity measure.

2008 Dal Lago and Martini [17] prove the time invariance thesis for the weak call-by-value
λ-calculus w.r.t. Turing machines, with the sum over all differences of the size of the
redex and the size of the reduct as time complexity measure.

2015 Accattoli and Dal Lago [1] prove the time invariance thesis for the full λ-calculus w.r.t.
RAM machines, with the number of β-steps as time complexity measure.

2020 Forster, Kunze, and Roth [7] prove the invariance thesis for L w.r.t. Turing machines,
with the number of β-steps as time complexity measure and the maximum of the sizes of
all intermediate terms as space complexity measure.


All of these results come with different levels of formality in their proofs. Blelloch and
Greiner [3] only state their result, but do not give any details of a proof, much less provide
details of what the (RAM) machine carrying out the simulation might look like. Dal Lago and
Martini [17] give a proof sketch: They informally describe what a Turing machine simulating
L has to do, and for instance how many tapes it has, but do not give invariants for inductions.
The other, easier simulation direction is discussed in similar detail. In a later note, Dal
Lago and Accattoli [16] give a formal proof on paper that the λ-calculus, independent of
the concrete variant used, can simulate Turing machines with linear overhead. The proof
of the time invariance thesis for the full λ-calculus by Accattoli and Dal Lago [1] is based
on a careful and formal study of explicit sharing, but again the implementation in terms
of machines is left out. This technique is omnipresent in theoretical computer science and
mathematics: The folklore parts of results are only sketched or are entirely left out, while
the interesting parts are formalised and detailed proofs are given.

When mechanising a result in an interactive theorem prover, both aspects form their 
own challenges: For folklore results, first a proof has to be found, then formalised, then 
mechanised, and each individual step can prove challenging. For a formal proof, only missing 
details of an argument have to be recovered, which can still impose challenges. 

The proof of the invariance thesis for L [7] is accompanied by a mechanisation for one 
part of the novel contribution, namely two stack machine semantics for L. Since it is folklore 
that concrete algorithms can be implemented on Turing machines, this part is only sketched. 
In the present paper, we give a mechanised proof of the time invariance thesis for L in
Coq [26], providing all details left out in existing proofs. We mechanise two variants of the 
time invariance thesis for L. The first provides a Turing machine Msim, simulating L based 
on a stack machine for L with a heap, and vice versa a term Ssim simulating Turing machines: 


> Theorem 1 (Time Invariance Thesis for L w.r.t. Simulation). There are a Turing machine
Msim, an L-term Ssim, and encoding relations of L-terms and heaps on tapes, and Turing
machines and tapes as L-terms s.t.
1. Msim simulates L with polynomial overhead, i.e. there is a polynomial p s.t. for all closed s:
a. Whenever s terminates in i steps with value v, Msim run on an input tape encoding s
terminates in p(i, |s|) steps with an encoding of a heap containing v on its output tape,
where |s| is the size of s.
b. If Msim terminates on the encoding of s, then s terminates.
2. Ssim simulates TMs with linear overhead, i.e. there is a constant c s.t. for all M:
a. Whenever $M : \mathsf{TM}^n_\Sigma$ terminates on tapes t in i steps with result tapes t′, Ssim run on
an encoding of both M and t evaluates to the encoding of t′ in $c \cdot i \cdot |M| + c$ steps.
b. If Ssim terminates on the encodings of M and t, then M terminates on t.


As a by-product, this theorem yields the first mechanised proof that the halting problems
for L and Turing machines are many-one equivalent, which we contribute as a cornerstone
to the Coq Library of Undecidability Proofs [11].

Two subtleties are important to point out: First, an L-computation does not have explicit
input and output. Any L-term s can compute on its own. Input can be realised by
application to an (encoded) value (s v) and the result of the computation can be considered
its output. In contrast, a Turing machine M cannot compute without the value of its tapes
specified, unless one explicitly defines the canonical value of tapes to be e.g. empty. Secondly,
the theorem depends on notions of encoding an L-term on a tape, of encoding heaps on a 
tape, and of encoding a Turing machine and tapes as L-terms. The choice of such encodings 
is not canonical, and they have to fulfil certain properties for the theorem to be meaningful. 
For instance, the unfolding of a heap to an L-term on a Turing machine has to be (at most) 
polynomial in time, otherwise the theorem is meaningless. 

As is well-known, computation in L and the λ-calculus in general is potentially subject
to so-called “size explosion”, i.e. the size of a result of a computation can be exponentially
larger than both the size of the input and the number of steps. Thus, in the above theorem,
unfolding the heap containing v will be polynomial in the size of v and i, but the size of v
might be exponential in both the size of s and i. The following version of the time invariance
thesis abstracts away from such subtleties by only considering the computability of k-ary
relations on boolean strings, while still being as transparent as possible:


> Theorem 2 (Time Invariance Thesis for L w.r.t. Computability). Let $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$. Then
1. If R is L-computable with time complexity function $\tau_R$, then there is
a polynomial p s.t. R is TM-computable with time complexity function
$(m, n_1, \dots, n_k) \mapsto p(m, n_1, \dots, n_k, \tau_R(n_1, \dots, n_k))$.

2. If R is TM-computable with time complexity function $\tau_R$, then there is a constant c
s.t. R is L-computable with time complexity function
$(m, n_1, \dots, n_k) \mapsto c \cdot m \cdot n_1 \cdots n_k \cdot \tau_R(n_1, \dots, n_k) + c$.


We accommodate size explosion by making the time complexity function of a relation
R also depend on the size m of the output. Of course, for Turing machines m is always
bounded linearly by $n_1, \dots, n_k$ and i.


Two corollaries are immediate: If the output of a relation R is polynomially bounded by 
its input, R is polynomially TM-computable if and only if it is polynomially L-computable. 
In particular, the complexity class P (i.e. PTIME) agrees for both L and Turing machines. 

Note that the theorem still depends on encodings, namely on the precise definitions
of a relation R being TM- and L-computable. However, the two notions can be assessed
independently: L-computability only depends on how to encode boolean strings (LB), and
similarly for TM-computability. Unsurprisingly, both encodings are very simple, and it is
obvious that e.g. equality checking works in linear time.

The theorems are furthermore equivalent, in the sense that one can prove one from the
other without additional complex simulations: We obtain our mechanised proof of the time
invariance thesis w.r.t. computability from the time invariance thesis w.r.t. simulation, by
defining terms and machines dynamically exchanging between the encodings of LB on Turing
machines, LB in L, L-terms on Turing machines, and Turing machines as L-terms. The
other direction would be possible by verifying a universal Turing machine with polynomial
overhead in time for Turing machine computation, and similarly a universal L-term.

We avoid non-polynomial overhead for terms exhibiting size explosion by relying on heaps. 
However, this means that the space overhead is linear-logarithmic rather than constant 
factor, due to terms exhibiting pointer explosion. A mechanised proof of the time and space 
invariance thesis for L is left for future work. 

The present work is based on the certifying extraction framework for L by Forster and 
Kunze [6] and the Turing machine verification framework by Forster, Kunze, and Wuttke [9]. 

The certifying extraction framework allows extracting (by definition total) Coq functions 
on first-order types to L and automatically proves correctness. The user can provide a time 
complexity function, which is then automatically verified as well. We use an L-computability 
proof of Turing machine transitions, which is already a case study of the framework. 

The Turing machine verification framework allows giving algorithms in the style of a 
register-based while-language, and a corresponding machine is automatically constructed 
behind the scenes. Separate correctness and verification proofs are then inclusion proofs 
between the automatically derived and the user-given relations for the constructed machine. 


Contribution. The main contributions of the paper are mechanised proofs of the time
invariance thesis for L w.r.t. both simulation and computability. As a by-product, we obtain 
that the halting problems for Turing machines and L are many-one equivalent, a result we 
contribute to the Coq Library of Undecidability Proofs [11]. We also contribute a Hoare logic 
framework for the verification of correctness, time, and space complexity of Turing machines. 


Outline. The paper is split into four parts: First, Section 3 introduces the weak call-by-value 
λ-calculus L with small-step, big-step, and stack machine semantics, and Section 4 introduces
Turing machines and the Turing machine verification framework from [9]. Secondly, we 
explain the simulation of L computations on Turing machines with polynomial overhead in 
Section 5, and how to obtain a simulator Ssim from the L-computability of Turing machine 
steps [6] in Section 6, which yields a mechanised proof of the time invariance thesis for L w.r.t. 
simulation. Thirdly, we explain how to prove that TM-computable relations $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$
are L-computable with polynomial overhead in Section 7. Lastly, we introduce a novel Hoare
logic verification framework for Turing machines in Section 8, which we use to prove that
L-computable relations $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$ are TM-computable with polynomial overhead
in Section 9, which yields a mechanised proof of the time invariance thesis for L w.r.t.
computability.
Related work. Considering only the simulation of models of computation without time
complexity, we are aware of Xu et al. [29] mechanising the equivalence of register machines
and Turing machines, Forster and Larchey-Wendling [10] with a mechanised compilation of
register machines to binary stack machines, and Larchey-Wendling and Forster [18], proving
that the halting problems of register machines and μ-recursive functions, and the solvability
of diophantine equations are many-one equivalent. In unpublished work, Pous [22] mechanises 
an equivalence proof between counter machines and partial recursive functions in Coq. 


2 Notations and Definitions


We use the following inductive types: 


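$$\begin{aligned}
\mathbb{1} &::= () && \text{(unit)} & o : \mathbb{O}\,A &::= \mathsf{None} \mid \mathsf{Some}\ a && \text{(options)} \\
n : \mathbb{N} &::= 0 \mid S\,n && \text{(natural numbers)} & l : \mathbb{L}\,A &::= [] \mid a :: l && \text{(lists)} \\
b : \mathbb{B} &::= \mathsf{true} \mid \mathsf{false} && \text{(booleans)} & A + B &::= \mathsf{inl}\ a \mid \mathsf{inr}\ b && \text{(sums)} \\
A \times B &::= (a, b) && \text{(pairs)}
\end{aligned}$$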

We use the notation if a is s then $b_1$ else $b_2$ for inline case analysis, which evaluates to
$b_1$ if a is of the shape s (e.g. for a non-zero number s := S n or for a list s := [] or s := x :: l),
and to $b_2$ otherwise.
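We use the functions $\mathsf{map} : (A \to B) \to \mathbb{L}A \to \mathbb{L}B$ and
$\mathsf{map}_2 : (A \to B \to C) \to \mathbb{L}A \to \mathbb{L}B \to \mathbb{L}C$, defined as follows:

$$\begin{aligned}
\mathsf{map}\ f\ (a :: l) &:= f\,a :: \mathsf{map}\ f\ l & \mathsf{map}_2\ f\ (a :: l_1)\ (b :: l_2) &:= f\,a\,b :: \mathsf{map}_2\ f\ l_1\ l_2 \\
\mathsf{map}\ f\ [] &:= [] & \mathsf{map}_2\ f\ l_1\ l_2 &:= []
\end{aligned}$$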
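As a small illustration, the two list functions can be sketched in Python. The names `map_list` and `map2` are hypothetical, and the sketch assumes the second clause of map₂ is the fall-through case, so the result is cut off as soon as either list is exhausted.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def map_list(f: Callable[[A], B], l: List[A]) -> List[B]:
    # map f (a :: l) := f a :: map f l ; map f [] := []
    return [f(a) for a in l]


def map2(f: Callable[[A, B], C], l1: List[A], l2: List[B]) -> List[C]:
    # map2 pairs elements position-wise; zip stops at the shorter list,
    # which matches the fall-through clause map2 f l1 l2 := [].
    return [f(a, b) for a, b in zip(l1, l2)]
```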


We use i, n, k, and m as letters for numbers, but try to be consistent in their use:
numbers of steps are i, numbers of tapes are n, the arity of inputs and relations is k, the sizes
of inputs are $n_1, \dots, n_k$ and the size of the output is m.

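We write $\mathbb{P}$ for the type of propositions. $R \subseteq A \times B$ is short for $R : A \to B \to \mathbb{P}$, and
$P \subseteq B$ is short for $P : B \to \mathbb{P}$. In particular, if $R \subseteq A \times B$ and $a : A$, then $R\,a \subseteq B$.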

$A^k$ is the type of vectors of length k with elements in A. We use bold letters (t, u, ...)
for vectors and reuse list functions such as $\mathsf{map}_2$ for vectors. The type $\mathsf{Fin}_n$ is inductively
defined to have exactly n elements. We write the elements of the type $\mathsf{Fin}_{S\,n}$ as 0, ..., n.

A retraction $X \hookrightarrow Y$ consists of a function $I : X \to Y$ and an inverse function $R : Y \to \mathbb{O}\,X$
s.t. $\forall x.\ R\,(I\,x) = \mathsf{Some}\ x$.


3 The call-by-value λ-calculus L


The call-by-value λ-calculus was introduced by Plotkin [21] as a variant of Church’s
λ-calculus [5]. The concrete variant of the call-by-value λ-calculus we present here is called
L [12]. We define syntax and semantics for L in Section 3.1 and introduce a stack machine
with a heap in Section 3.2.


3.1 Syntax, small-step, and big-step semantics 


We define the syntax of L using de Bruijn indices as the syntax of the full λ-calculus, i.e. as
variables, applications, or abstractions. The size |s| counts the number of constructors in s
with unary encoded de Bruijn indices.


$$s, t, u : \mathsf{tm} ::= n : \mathbb{N} \mid s\,t \mid \lambda s \qquad |n| := n + 1 \qquad |s\,t| := |s| + |t| + 1 \qquad |\lambda s| := 1 + |s|$$
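The syntax and size function can be sketched as follows. This is only an illustrative Python rendering, not the Coq definition; the class names `Var`, `App`, and `Lam` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    n: int  # de Bruijn index


@dataclass(frozen=True)
class App:
    s: "Term"
    t: "Term"


@dataclass(frozen=True)
class Lam:
    body: "Term"


Term = Union[Var, App, Lam]


def size(s: Term) -> int:
    """Number of constructors, with de Bruijn indices counted in unary."""
    if isinstance(s, Var):
        return s.n + 1                     # |n| := n + 1
    if isinstance(s, App):
        return size(s.s) + size(s.t) + 1   # |s t| := |s| + |t| + 1
    if isinstance(s, Lam):
        return 1 + size(s.body)            # |lambda s| := 1 + |s|
    raise TypeError(f"not a term: {s!r}")


# The running example (lambda x y. x x)(lambda z. z) in de Bruijn form.
example = App(Lam(Lam(App(Var(1), Var(1)))), Lam(Var(0)))
```

For the running example, the unary encoding of the two occurrences of index 1 contributes 2 each, giving a total size of 10.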


We use names for concrete terms on paper, e.g. write (λxy.xx)(λz.z) for (λλ 1 1)(λ 0).


We define a simple substitution operation $s^m_u$, agreeing with a more standard parallel
substitution operation when the term substituted with is closed:

$$n^m_u := \text{if } n = m \text{ then } u \text{ else } n \qquad (s\,t)^m_u := s^m_u\,t^m_u \qquad (\lambda s)^m_u := \lambda(s^{S\,m}_u)$$

Formally, we say that a term s is a closed term if $\forall n\,u.\ s^n_u = s$.
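We now define a small-step relation $\succ$, its n-step repetition $\succ^n$, and an inductive
characterisation of weak, call-by-value big-step evaluation $s \triangleright^{k} v$:

$$\frac{}{(\lambda s)(\lambda t) \succ s^0_{\lambda t}} \qquad \frac{s \succ s'}{s\,t \succ s'\,t} \qquad \frac{t \succ t'}{s\,t \succ s\,t'} \qquad \frac{}{s \succ^0 s} \qquad \frac{s \succ s' \quad s' \succ^n t}{s \succ^{S\,n} t}$$

$$\frac{}{\lambda s \triangleright^0 \lambda s} \qquad \frac{s \triangleright^{n_1} \lambda u \quad t \triangleright^{n_2} t' \quad u^0_{t'} \triangleright^{n_3} v}{s\,t \triangleright^{1 + n_1 + n_2 + n_3} v}$$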


Note that we have for example $(\lambda xy.xx)(\lambda z.z) \succ \lambda y.(\lambda z.z)(\lambda z.z)$. Evaluation is called
weak because the bodies of abstractions are not evaluated and call-by-value because arguments 
are evaluated before a function is called. Evaluation agrees with evaluation in Plotkin’s 
calculus [21] on closed terms, but does not treat variables as values. 
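The following Python sketch illustrates simple substitution and weak call-by-value big-step evaluation with a β-step counter. It assumes the standard reading of the rules (function and argument are evaluated to abstractions, then the body is evaluated after substituting index 0); terms are encoded as hypothetical tagged tuples `("var", n)`, `("app", s, t)`, `("lam", s)`.

```python
def subst(s, m, u):
    """Simple substitution: replace index m in s by u, bumping m under binders."""
    tag = s[0]
    if tag == "var":
        return u if s[1] == m else s
    if tag == "app":
        return ("app", subst(s[1], m, u), subst(s[2], m, u))
    return ("lam", subst(s[1], m + 1, u))


def eval_steps(s):
    """Return (k, v): the value v of s and the number k of beta-steps used.

    Bodies of abstractions are never evaluated (weak), and arguments are
    evaluated before the call (call-by-value). The function may diverge on
    non-terminating terms and raises on terms with free variables.
    """
    tag = s[0]
    if tag == "lam":
        return 0, s
    if tag == "app":
        n1, f = eval_steps(s[1])   # f is an abstraction ("lam", body)
        n2, a = eval_steps(s[2])
        n3, v = eval_steps(subst(f[1], 0, a))
        return 1 + n1 + n2 + n3, v
    raise ValueError("free variable: term is not closed")


identity = ("lam", ("var", 0))
# (lambda x y. x x)(lambda z. z)
example = ("app", ("lam", ("lam", ("app", ("var", 1), ("var", 1)))), identity)
```

On the running example, evaluation takes one β-step and stops at an abstraction whose body still contains a redex, which is exactly what makes the evaluation weak.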


> Lemma 3. If s is closed, $s \triangleright^{k} t$ if and only if $s \succ^{k} t$ and t is an abstraction.


Weak call-by-value reduction is uniformly confluent [20], meaning for a terminating term 
s, we can talk about the time complexity of s without fixing a reduction path. 


> Fact 4. If $s \succ t_1$ and $s \succ t_2$, then $t_1 = t_2$ or $\exists u.\ t_1 \succ u \land t_2 \succ u$.


> Corollary 5. If $s \triangleright^{n_1} t_1$ and $s \triangleright^{n_2} t_2$, then $n_1 = n_2$ and $t_1 = t_2$.


The L halting problem is defined as $\mathsf{Halt}_{\mathsf{L}}\,(s : \mathsf{tm}, H : \mathsf{closed}\ s) := \exists t.\ s \triangleright t$.
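To define L-computability we introduce so-called Scott encodings [14,19] for $\mathbb{B}$, $\mathbb{N}$, and $\mathbb{L}\mathbb{B}$,
which internalise the case-analysis behaviour of the respective types.

$$\begin{aligned}
\overline{\mathsf{true}} &:= \lambda xy.x & \overline{0} &:= \lambda xy.x & \overline{[]} &:= \lambda xy.x \\
\overline{\mathsf{false}} &:= \lambda xy.y & \overline{S\,n} &:= \lambda xy.y\,\overline{n} & \overline{b :: l} &:= \lambda xy.y\,\overline{b}\,\overline{l}
\end{aligned}$$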
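To see how Scott encodings internalise case analysis, here is a sketch using curried Python lambdas in place of L-terms. The function names are hypothetical, and the shape of the list encoding is assumed to be the two-case pattern shown for numbers: an encoded value applied to one branch per constructor selects the matching branch and passes it the constructor arguments.

```python
def enc_bool(b):
    # true := \x y. x ; false := \x y. y
    return (lambda x: lambda y: x) if b else (lambda x: lambda y: y)


def enc_nat(n):
    # 0 := \x y. x ; S n := \x y. y (enc n)
    if n == 0:
        return lambda x: lambda y: x
    pred = enc_nat(n - 1)
    return lambda x: lambda y: y(pred)


def enc_list(l):
    # [] := \x y. x ; b :: l := \x y. y (enc b) (enc l)
    if not l:
        return lambda x: lambda y: x
    head, tail = enc_bool(l[0]), enc_list(l[1:])
    return lambda x: lambda y: y(head)(tail)


def dec_bool(e):
    return e(True)(False)


def dec_nat(e):
    # Case analysis: 0 in the first branch, successor handler in the second.
    return e(0)(lambda pred: 1 + dec_nat(pred))


def dec_list(e):
    return e([])(lambda h: lambda t: [dec_bool(h)] + dec_list(t))
```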


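A relation $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$ is L-computable with time complexity relation $\tau \subseteq \mathbb{N}^{k+1} \times \mathbb{N}$ if

$$\begin{aligned}
\exists s : \mathsf{tm}.\ \forall l_1 \dots l_k.\ & (\forall l.\ R\,(l_1, \dots, l_k)\,l \to \exists i \le \tau(|l|, |l_1|, \dots, |l_k|).\ s\,\overline{l_1} \dots \overline{l_k} \triangleright^{i} \overline{l}) \;\land \\
& (\forall t.\ s\,\overline{l_1} \dots \overline{l_k} \triangleright t \to \exists l.\ R\,(l_1, \dots, l_k)\,l)
\end{aligned}$$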


Note that time complexity is a relation rather than a function. We will only consider
functional time complexity relations, and write them on paper as if they were functions.
However, for undecidable relations R (e.g. expressing halting problems necessary for deducing
a proof of the time invariance thesis w.r.t. simulation from the one w.r.t. computability), it
is crucial that $\tau$ is a relation: one could use a complexity function $\tau : \mathbb{N}^{k+1} \to \mathbb{N}$ to write a
Coq decision function checking whether $\exists l.\ R\,(l_1, \dots, l_k)\,l$. Since we know that such a decision
is impossible in the constructive type theory of Coq, time complexity functions cannot be
defined for undecidable problems.


3.2 Stack machine semantics 


The big-step semantics allows for a compact definition, but is not ideal for implementations 
of L. To prepare for a simulation of L on Turing machines we introduce a stack machine for L, 
utilising references to a heap instead of substitution, similar to the heap machine by Kunze 
et al. [15]. In contrast to the results there, we give a direct correctness proof instead of a 
step-wise refinement via several machines. Our machine is also similar to the heap machine 
by Forster et al. [7], but with fewer reduction rules simplifying verification. 
Instead of terms, we will work with programs $P, Q : \mathsf{Pro} := \mathbb{L}\,\mathsf{Com}$, which are lists of
commands. Commands are reference, application, abstraction, or return tokens:

$$c : \mathsf{Com} ::= \mathsf{ref}\ n \mid \mathsf{app} \mid \mathsf{lam} \mid \mathsf{ret}$$

The compilation function $\gamma : \mathsf{tm} \to \mathsf{Pro}$ compiles terms to programs:

$$\gamma\,n := [\mathsf{ref}\ n] \qquad \gamma\,(s\,t) := \gamma\,s \mathbin{+\!\!+} \gamma\,t \mathbin{+\!\!+} [\mathsf{app}] \qquad \gamma\,(\lambda s) := \mathsf{lam} :: \gamma\,s \mathbin{+\!\!+} [\mathsf{ret}]$$

We have γ((λxy.xx)(λz.z)) = [lam; lam; ref 1; ref 1; app; ret; ret; lam; ref 0; ret; app].
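The compilation function is a post-order traversal that brackets abstraction bodies with lam and ret. A minimal Python sketch, assuming terms are hypothetical tagged tuples and commands are either the strings `"app"`, `"lam"`, `"ret"` or a pair `("ref", n)`:

```python
def compile_term(s):
    """Compile a de Bruijn term to a program (a list of commands)."""
    tag = s[0]
    if tag == "var":
        return [("ref", s[1])]
    if tag == "app":
        # Function first, then argument, then the application token.
        return compile_term(s[1]) + compile_term(s[2]) + ["app"]
    if tag == "lam":
        # The body is enclosed between lam and ret, like parentheses.
        return ["lam"] + compile_term(s[1]) + ["ret"]
    raise ValueError(f"not a term: {s!r}")


# (lambda x y. x x)(lambda z. z)
example = ("app",
           ("lam", ("lam", ("app", ("var", 1), ("var", 1)))),
           ("lam", ("var", 0)))
```

Compiling the running example reproduces the program given in the text.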

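Compiled abstractions start with the token lam and end with ret. We can thus define a
function $\varphi\,P : \mathbb{O}(\mathsf{Pro} \times \mathsf{Pro})$ that extracts the body of an abstraction by matching the tokens
like parentheses. We define $\varphi\,P := \varphi_{0,[]}\,P$, where $\varphi_{k,Q}\,P$ is an auxiliary function storing the
number k of unmatched lam and the processed prefix Q:

$$\begin{aligned}
\varphi_{k,Q}\ [] &:= \mathsf{None} \\
\varphi_{0,Q}\ (\mathsf{ret} :: P) &:= \mathsf{Some}\,(Q, P) \\
\varphi_{S\,k,Q}\ (\mathsf{ret} :: P) &:= \varphi_{k,\,Q \mathbin{+\!\!+} [\mathsf{ret}]}\ P \\
\varphi_{k,Q}\ (\mathsf{lam} :: P) &:= \varphi_{S\,k,\,Q \mathbin{+\!\!+} [\mathsf{lam}]}\ P \\
\varphi_{k,Q}\ (c :: P) &:= \varphi_{k,\,Q \mathbin{+\!\!+} [c]}\ P \quad \text{if } c = \mathsf{ref}\ n \text{ or } \mathsf{app}
\end{aligned}$$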
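The parenthesis-matching idea can be sketched iteratively in Python. The function name `phi` and the command representation (strings `"app"`, `"lam"`, `"ret"` and pairs `("ref", n)`) are assumptions of this sketch; `None` plays the role of the option type's None.

```python
def phi(P):
    """Split P into (body, rest) at the first unmatched ret, or return None.

    k counts lam tokens that still await their ret, and Q accumulates the
    prefix processed so far.
    """
    k, Q = 0, []
    for idx, c in enumerate(P):
        if c == "ret":
            if k == 0:
                return Q, P[idx + 1:]   # the ret closing the abstraction
            k -= 1                      # ret of a nested abstraction
        elif c == "lam":
            k += 1
        Q.append(c)
    return None                          # ran out of tokens: no matching ret


# Tail of the compiled running example after its leading lam has been consumed.
rest = ["lam", ("ref", 1), ("ref", 1), "app", "ret", "ret",
        "lam", ("ref", 0), "ret", "app"]
```

Applied to `rest`, the function returns the body of the outer abstraction together with the remaining program.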


The states of the heap machine are tuples T, V, H. The control stack T and the value 
stack V are lists of closures g : Clos := Pro x N. A closure (P,a) denotes an open program, 
where the reference 0 in P has to be looked up at address a : N in the heap when evaluating. 

The heap H is a linked list of heap entries e : Entry := O(Clos x N), i.e. an entry is either 
empty, or contains the head of the list and the address of its tail. Given a heap H and an 
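address a, H[a] : O Entry denotes the a-th element of H. We define H[a, n] to be the n-th
entry on the heap starting at address a as follows:

$$H[a, n] := \text{if } H[a] \text{ is } \mathsf{Some}\,(\mathsf{Some}\,(g, b)) \text{ then } (\text{if } n \text{ is } S\,n' \text{ then } H[b, n'] \text{ else } \mathsf{Some}\ g) \text{ else } \mathsf{None}$$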


We can now define the small-step semantics of the stack machine for L: 


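$$\begin{aligned}
(\mathsf{lam} :: P, a) :: T,\ V,\ H &\leadsto (P', a) ::_{tc} T,\ (Q, a) :: V,\ H && \text{if } \varphi\,P = \mathsf{Some}\,(Q, P') \\
(\mathsf{ref}\ n :: P, a) :: T,\ V,\ H &\leadsto (P, a) ::_{tc} T,\ g :: V,\ H && \text{if } H[a, n] = \mathsf{Some}\ g \\
(\mathsf{app} :: P, a) :: T,\ g :: (Q, b) :: V,\ H &\leadsto (Q, |H|) :: (P, a) ::_{tc} T,\ V,\ H \mathbin{+\!\!+} [\mathsf{Some}\,(g, b)]
\end{aligned}$$

Here, $(P, a) ::_{tc} T := \text{if } P \text{ is } [] \text{ then } T \text{ else } (P, a) :: T$.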

In the abstraction rule, the complete abstraction is parsed via ¢ and its body put on the 
value stack. In principle :: instead of $::_{tc}$ could be used to obtain a correct machine, however
the time complexity of this machine is easier to verify using this optimising operation. 

Similarly, in the reference rule, the body of the abstraction corresponding to the variable 
n is looked up in the heap starting at address a and the result is put on the value stack. 

In the application rule, the machine takes the closure of the called function (Q, b) and its 
argument g from the value stack. The address b is bound to g in the heap, the entry being 
appended to H, thus obtaining address |H|. The machine continues evaluating the body Q, 
where the value for reference 0 can be looked up at address |H|, where it was just placed. 

Given a closed term s, the initial state of the machine is ([(γ s, 0)], [], []), i.e. an empty
value stack, an empty heap, and the closure (γ s, 0) on the task stack. In fact, for a closed
term, 0 can be replaced by any address since there are no free references. An example run of
the machine for (λxy.xx)(λz.z) can be found in Figure 1, using a as start address.
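The three machine rules can be sketched as a small Python interpreter. This is an illustration only, not the mechanised definition: states are triples of Python lists, heap entries are pairs `(g, b)` (with `None` standing for an empty entry), the start address is fixed to 0, and the helper names `phi`, `lookup`, `cons_tc`, `step`, and `run` are hypothetical.

```python
def phi(P):
    """Split P at the first unmatched ret into (body, rest), else None."""
    k, Q = 0, []
    for idx, c in enumerate(P):
        if c == "ret":
            if k == 0:
                return Q, P[idx + 1:]
            k -= 1
        elif c == "lam":
            k += 1
        Q.append(c)
    return None


def lookup(H, a, n):
    """H[a, n]: follow n tail pointers from address a, return the closure."""
    while 0 <= a < len(H) and H[a] is not None:
        g, b = H[a]
        if n == 0:
            return g
        a, n = b, n - 1
    return None


def cons_tc(closure, T):
    """Push a closure on the control stack unless its program is empty."""
    return T if not closure[0] else [closure] + T


def step(state):
    """One machine step, or None if no rule applies."""
    T, V, H = state
    if not T or not T[0][0]:
        return None
    (P, a), rest = T[0], T[1:]
    c, P = P[0], P[1:]
    if c == "lam":
        r = phi(P)
        if r is None:
            return None
        Q, P2 = r
        return cons_tc((P2, a), rest), [(Q, a)] + V, H
    if c == "app":
        if len(V) < 2:
            return None
        g, (Q, b) = V[0], V[1]
        return [(Q, len(H))] + cons_tc((P, a), rest), V[2:], H + [(g, b)]
    if isinstance(c, tuple) and c[0] == "ref":
        g = lookup(H, a, c[1])
        return None if g is None else (cons_tc((P, a), rest), [g] + V, H)
    return None  # a stray ret: the machine is stuck


def run(P, max_steps=10_000):
    """Run from the initial state; return (number of steps, final state)."""
    state, steps = ([(P, 0)], [], []), 0
    while steps < max_steps:
        nxt = step(state)
        if nxt is None:
            return steps, state
        state, steps = nxt, steps + 1
    raise RuntimeError("step budget exhausted")


program = ["lam", "lam", ("ref", 1), ("ref", 1), "app", "ret", "ret",
           "lam", ("ref", 0), "ret", "app"]
```

Running `program` performs two lam steps, one app step, and a final lam step, ending with an empty control stack and a single closure on the value stack.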

To state the correctness of the stack machine we need to define an unfolding operation
$\mathsf{unf}_H\,(P, a)$. We will use functional notation for unfolding on paper, but define it as a
functional relation in Coq, since the function is not structurally recursive, and not even
terminating on cyclic heaps. The function $\mathsf{unf} : \mathsf{Pro} \to \mathsf{tm}$ unfolds programs from the value
stack into a term by inverting γ. It adds λ to the result, since only the bodies of abstractions
are saved on the value stack. The function $\mathsf{unf}_{H,a,k} : \mathsf{tm} \to \mathsf{tm}$ substitutes free variables
$n \ge k$ in a term by $H[a, n - k]$. Finally, $\mathsf{unf}_H : \mathsf{Pro} \times \mathbb{N} \to \mathsf{tm}$ unfolds a result using the
two previous functions. The example from above is continued in Figure 1.


Machine run, using a as start address:

[([lam; lam; ref 1; ref 1; app; ret; ret; lam; ref 0; ret; app], a)], [], []
⇝ [([lam; ref 0; ret; app], a)], [([lam; ref 1; ref 1; app; ret], a)], []
⇝ [([app], a)], [([ref 0], a), ([lam; ref 1; ref 1; app; ret], a)], []
⇝ [([lam; ref 1; ref 1; app; ret], 0)], [], [Some (([ref 0], a), a)]
⇝ [], [([ref 1; ref 1; app], 0)], [Some (([ref 0], a), a)]

Unfolding of the result, with H := [Some (([ref 0], a), a)]:

unf_H ([ref 1; ref 1; app], 0)
= unf_{H,0,0} (unf [ref 1; ref 1; app])
= λ unf_{H,0,1} (1 1)
= λ (unf_{H,0,1} 1) (unf_{H,0,1} 1)
= λ (unf_{H,0,0} (unf [ref 0])) (unf_{H,0,0} (unf [ref 0]))
= λ (λ0)(λ0) = λy.(λz.z)(λz.z)

Figure 1 Example of the execution for term (λxy.xx)(λz.z).


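$$\begin{aligned}
\mathsf{unf}\ P &:= \lambda t && (\text{if } \gamma\,t = P) \\
\mathsf{unf}_{H,a,k}\ n &:= n && (\text{if } n < k) \\
\mathsf{unf}_{H,a,k}\ n &:= \mathsf{unf}_{H,b,0}\,(\mathsf{unf}\ P) && (\text{if } n \ge k \text{ and } H[a, n - k] = \mathsf{Some}\,(P, b)) \\
\mathsf{unf}_{H,a,k}\ (s\,t) &:= (\mathsf{unf}_{H,a,k}\ s)\,(\mathsf{unf}_{H,a,k}\ t) \\
\mathsf{unf}_{H,a,k}\ (\lambda s) &:= \lambda(\mathsf{unf}_{H,a,S\,k}\ s) \\
\mathsf{unf}_H\,(P, a) &:= \mathsf{unf}_{H,a,0}\,(\mathsf{unf}\ P)
\end{aligned}$$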
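Unfolding can be sketched in Python as follows. As in the text, the recursion is not structural and the sketch may fail to terminate on cyclic heaps. The helper names are hypothetical, terms are tagged tuples, heap entries are pairs `(g, b)` or `None`, and `decompile` is an assumed stack-based inverse of the compilation function.

```python
def lookup(H, a, n):
    """H[a, n]: follow n tail pointers from address a, return the closure."""
    while 0 <= a < len(H) and H[a] is not None:
        g, b = H[a]
        if n == 0:
            return g
        a, n = b, n - 1
    return None


def decompile(P):
    """Return the term t whose compilation is P, or None if there is none."""
    stack = []
    for c in P:
        if c == "lam":
            stack.append("(")                     # marker for an open lam
        elif c == "ret":
            if len(stack) < 2 or stack[-1] == "(" or stack[-2] != "(":
                return None
            body = stack.pop()
            stack.pop()                           # drop the marker
            stack.append(("lam", body))
        elif c == "app":
            if len(stack) < 2 or "(" in stack[-2:]:
                return None
            t = stack.pop()
            s = stack.pop()
            stack.append(("app", s, t))
        else:
            stack.append(("var", c[1]))           # c is ("ref", n)
    if len(stack) == 1 and stack[0] != "(":
        return stack[0]
    return None


def unf_prog(P):
    """unf P: add the lambda that is dropped when a body is stored."""
    t = decompile(P)
    if t is None:
        raise ValueError("program is not a compiled term")
    return ("lam", t)


def unf_term(H, a, k, s):
    """Replace free variables n >= k of s by the unfolded entry H[a, n - k]."""
    tag = s[0]
    if tag == "var":
        n = s[1]
        if n < k:
            return s
        g = lookup(H, a, n - k)
        if g is None:
            raise ValueError("dangling reference")
        P, b = g
        return unf_term(H, b, 0, unf_prog(P))
    if tag == "app":
        return ("app", unf_term(H, a, k, s[1]), unf_term(H, a, k, s[2]))
    return ("lam", unf_term(H, a, k + 1, s[1]))


def unf(H, closure):
    """Unfold a closure (P, a) with respect to the heap H."""
    P, a = closure
    return unf_term(H, a, 0, unf_prog(P))


heap = [(([("ref", 0)], 0), 0)]                   # start address taken as 0
result = unf(heap, ([("ref", 1), ("ref", 1), "app"], 0))
```

With start address 0, unfolding the final closure of the example run yields the abstraction whose body applies the identity to itself.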


We define the size of the components of a stack machine as follows: 


$$|\mathsf{ref}\ n| := |n| + 1 \qquad |\mathsf{app}| := |\mathsf{lam}| := |\mathsf{ret}| := 1 \qquad |n : \mathbb{N}| := n + 1$$
$$|(a, b)| := |a| + |b| + 1 \qquad |[]| := 1 \qquad |x :: l| := |x| + |l|$$
The final correctness theorem then reads: 


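> Theorem 6. Let s be closed.

1. If $s \triangleright^{i} v$ then $([(\gamma\,s, 0)], [], []) \leadsto^{3i+1} ([], [(P, a)], H)$ for some P, a and H s.t.
$\mathsf{unf}_H\,(P, a) = v$.

2. If $([(\gamma\,s, 0)], [], []) \leadsto^{i} (T, V, H)$ then $|(T, V, H)| \le c \cdot (i + 1) \cdot (i + |s| + 1)$ for some c, and
if furthermore $\neg\exists \sigma.\ (T, V, H) \leadsto \sigma$, then $T = []$,
$V = [(P, a)]$, and $s \triangleright v$ for some P, a, and
v s.t. $\mathsf{unf}_H\,(P, a) = v$ is defined.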


3.3 Mechanisation in Coq 


The weak call-by-value λ-calculus L is a sweet spot for the mechanisation of computability
and complexity theory, but only since it is engineered to be one. In principle, there is an
abundance of options which λ-calculus to choose: call-by-value or call-by-name, weak or
strong, variables are values or not, de Bruijn encoding with simple substitution, or with 
parallel substitution, or locally nameless, or parametric higher-order abstract syntax. 

For the implementation of L on Turing machines it is mainly the choice of a simple 
substitution operation based on de Bruijn indices that is crucial. Parallel substitution is 
considerably more complicated to define, since it is not structurally recursive and a priori uses 
functions, i.e. uncountable types, to represent substitutions. In contrast, simple substitutions 
are structurally recursive and only require natural numbers for their definition. 

To simulate Turing machines in L, it is crucial that also a small step semantics is available: 
the non-determinism of $\succ$, i.e. that it applies to both sides of an application, allows (directed)
equational reasoning in correctness proofs without any overhead. 
4 Turing machines


Turing machines [27] are widely used in books on computability theory, are the standard 
model of computation for complexity theory, and are still considered by many to be the model 
of computation most convincingly capturing all “effectively calculable” functions. Despite
their universal use, there is however no consensus on how to formally define Turing machines
in the literature. There are a multitude of definitions which can all be proved equivalent. In
Appendix A, we compare the definition we use [9] to the one by Hopcroft et al. [13].

In Section 4.2, we give an overview of the Turing machine verification framework from [9]. 


4.1 Definition 


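We start by defining a tape over type Σ inductively using four constructors [2]:

$$\mathsf{tp}_\Sigma ::= \mathsf{niltp} \mid \mathsf{leftof}\ r\ rs \mid \mathsf{midtp}\ ls\ m\ rs \mid \mathsf{rightof}\ l\ ls \qquad \text{where } m, l, r : \Sigma \text{ and } ls, rs : \mathbb{L}\,\Sigma$$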


The representation enforces that a tape contains a continuous sequence of symbols,
and that either a symbol or one of the ends of the tape is distinguished as head position. 
There are no explicit blank symbols, which allows for a unique representation of every tape 
and no well-formedness predicate is needed. 

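We define a type of moves Move, a function $\mathsf{mv} : \mathsf{Move} \to \mathsf{tp}_\Sigma \to \mathsf{tp}_\Sigma$ applying a move to
a tape, a function $\mathsf{wr} : \mathbb{O}\,\Sigma \to \mathsf{tp}_\Sigma \to \mathsf{tp}_\Sigma$ writing to a tape, and a function $\mathsf{curr} : \mathsf{tp}_\Sigma \to \mathbb{O}\,\Sigma$
obtaining the current symbol of a tape in Figure 2.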
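The tape operations of Figure 2 can be sketched in Python. The representation is an assumption of this sketch: tapes are tagged tuples mirroring the four constructors, the left part of a tape is stored with the symbol nearest to the head first, moves are the strings `"L"`, `"N"`, `"R"`, and Python's `None` stands for the option value None (so symbols themselves must not be `None`).

```python
NILTP = ("niltp",)


def mv(m, t):
    """Apply move m to tape t; moves that would leave the tape further are no-ops."""
    kind = t[0]
    if m == "L":
        if kind == "rightof":
            _, l, ls = t
            return ("midtp", ls, l, [])
        if kind == "midtp":
            _, ls, a, rs = t
            if not ls:
                return ("leftof", a, rs)
            return ("midtp", ls[1:], ls[0], [a] + rs)
    elif m == "R":
        if kind == "leftof":
            _, r, rs = t
            return ("midtp", [], r, rs)
        if kind == "midtp":
            _, ls, a, rs = t
            if not rs:
                return ("rightof", a, ls)
            return ("midtp", [a] + ls, rs[0], rs[1:])
    return t  # move N, and all remaining cases


def wr(s, t):
    """Write the optional symbol s at the head position of t."""
    if s is None:
        return t
    kind = t[0]
    if kind == "niltp":
        return ("midtp", [], s, [])
    if kind == "midtp":
        return ("midtp", t[1], s, t[3])
    if kind == "leftof":
        return ("midtp", [], s, [t[1]] + t[2])
    return ("midtp", [t[1]] + t[2], s, [])  # rightof l ls


def curr(t):
    """Symbol under the head, or None if the head is off the written part."""
    return t[2] if t[0] == "midtp" else None
```

Because there are no blank symbols, writing on an empty tape or just beyond either end extends the tape, while reading there yields no symbol.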

We define multi-tape Turing machines $M : \mathsf{TM}^n_\Sigma$, where $n : \mathbb{N}$ is the number of tapes and
the finite type Σ is the alphabet, as dependent pairs $(Q, \delta, q_0, \mathsf{halt})$ where Q is a finite type,
$\delta : Q \times (\mathbb{O}\,\Sigma)^n \to Q \times (\mathbb{O}\,\Sigma \times \mathsf{Move})^n$, $q_0 : Q$ is the starting state, and $\mathsf{halt} : Q \to \mathbb{B}$ indicates
halting states. Turing machine evaluation $M(q, t) \triangleright^{i} (q', t')$ and the Turing
machine halting problem $\mathsf{Halt}_{\mathsf{TM}}$ are defined in Figure 2.

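A relation $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$ is TM-computable with time complexity relation $\tau \subseteq \mathbb{N}^{k+1} \times \mathbb{N}$ if

$$\begin{aligned}
& \exists n : \mathbb{N}.\ \exists \Sigma.\ \exists s\ \mathit{bl} : \Sigma.\ s \ne \mathit{bl} \land \exists M : \mathsf{TM}^{1+k+n}_\Sigma.\ \forall l_1 \dots l_k. \\
& \quad (\forall l.\ R\,(l_1, \dots, l_k)\,l \to \exists i \le \tau(|l|, |l_1|, \dots, |l_k|).\ \exists q\,t.\ M(q_0, [\mathsf{niltp}, \overline{l_1}, \dots, \overline{l_k}, \mathsf{niltp}, \dots, \mathsf{niltp}]) \triangleright^{i} (q, t) \land t[0] = \overline{l}) \;\land \\
& \quad (\forall q\,t\,i.\ M(q_0, [\mathsf{niltp}, \overline{l_1}, \dots, \overline{l_k}, \mathsf{niltp}, \dots, \mathsf{niltp}]) \triangleright^{i} (q, t) \to \exists l.\ R\,(l_1, \dots, l_k)\,l)
\end{aligned}$$

with $\overline{[x_1, \dots, x_n]} := \mathsf{midtp}\ []\ \mathit{bl}\ [\overline{x_1}, \dots, \overline{x_n}]$ and $\overline{\mathsf{true}} := s$, $\overline{\mathsf{false}} := \mathit{bl}$.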


4.2 Verified programming of Turing machines 


As presented, Turing machines are not compositional: There is no canonical way to
execute a 5-tape machine over alphabet $\mathbb{B}$ after a 3-tape machine over $\mathbb{O}(\mathbb{B} \times \mathbb{B})$.


To allow for the composition of Turing machines and their verification, we first introduce
labellings in order to abstract away from the state space. A labelled Turing machine over a
type L, written $M : \mathsf{TM}^n_\Sigma(L)$, is a dependent pair $(M', \mathsf{lab}_M)$ of a machine $M' : \mathsf{TM}^n_\Sigma$ and a
labelling function $\mathsf{lab}_M : Q_{M'} \to L$.

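To prove the soundness of machines, we introduce realisation. A Turing machine
$M : \mathsf{TM}^n_\Sigma(L)$ realises a relation $R \subseteq \mathsf{tp}^n_\Sigma \times (L \times \mathsf{tp}^n_\Sigma)$ if

$$M \vDash R := \forall t\,i\,q\,t'.\ M(q_0, t) \triangleright^{i} (q, t') \to R\ t\ (\mathsf{lab}_M\ q, t')$$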


1 chosen for instance by Wikipedia as reference definition 
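$$\mathsf{Move} ::= L \mid N \mid R$$

$$\mathsf{mv} : \mathsf{Move} \to \mathsf{tp}_\Sigma \to \mathsf{tp}_\Sigma$$

$$\begin{aligned}
\mathsf{mv}\ L\ (\mathsf{rightof}\ l\ ls) &:= \mathsf{midtp}\ ls\ l\ [] & \mathsf{mv}\ R\ (\mathsf{leftof}\ r\ rs) &:= \mathsf{midtp}\ []\ r\ rs \\
\mathsf{mv}\ L\ (\mathsf{midtp}\ []\ m\ rs) &:= \mathsf{leftof}\ m\ rs & \mathsf{mv}\ R\ (\mathsf{midtp}\ ls\ a\ []) &:= \mathsf{rightof}\ a\ ls \\
\mathsf{mv}\ L\ (\mathsf{midtp}\ (l :: ls)\ a\ rs) &:= \mathsf{midtp}\ ls\ l\ (a :: rs) & \mathsf{mv}\ R\ (\mathsf{midtp}\ ls\ a\ (r :: rs)) &:= \mathsf{midtp}\ (a :: ls)\ r\ rs \\
\mathsf{mv}\ m\ t &:= t \quad \text{in all other cases}
\end{aligned}$$

$$\mathsf{wr} : \mathbb{O}\,\Sigma \to \mathsf{tp}_\Sigma \to \mathsf{tp}_\Sigma$$

$$\begin{aligned}
\mathsf{wr}\ \mathsf{None}\ t &:= t & \mathsf{wr}\ (\mathsf{Some}\ a)\ \mathsf{niltp} &:= \mathsf{midtp}\ []\ a\ [] \\
\mathsf{wr}\ (\mathsf{Some}\ a)\ (\mathsf{midtp}\ ls\ b\ rs) &:= \mathsf{midtp}\ ls\ a\ rs & \mathsf{wr}\ (\mathsf{Some}\ a)\ (\mathsf{leftof}\ r\ rs) &:= \mathsf{midtp}\ []\ a\ (r :: rs) \\
\mathsf{wr}\ (\mathsf{Some}\ a)\ (\mathsf{rightof}\ l\ ls) &:= \mathsf{midtp}\ (l :: ls)\ a\ []
\end{aligned}$$

$$\mathsf{curr} : \mathsf{tp}_\Sigma \to \mathbb{O}\,\Sigma \qquad \mathsf{curr}\ (\mathsf{midtp}\ ls\ a\ rs) := \mathsf{Some}\ a \qquad \mathsf{curr}\ t := \mathsf{None} \quad \text{otherwise}$$

$$\frac{\mathsf{halt}\ q = \mathsf{true}}{M(q, t) \triangleright^{0} (q, t)} \qquad \frac{\mathsf{halt}\ q = \mathsf{false} \quad \delta(q, \mathsf{map}\ \mathsf{curr}\ t) = (q', a) \quad M(q', \mathsf{map}_2\ (\lambda (c, m)\ t.\ \mathsf{mv}\ m\ (\mathsf{wr}\ c\ t))\ a\ t) \triangleright^{i} (q'', t')}{M(q, t) \triangleright^{S\,i} (q'', t')}$$

$$\mathsf{Halt}_{\mathsf{TM}^n_\Sigma}\,(M : \mathsf{TM}^n_\Sigma, t : \mathsf{tp}^n_\Sigma) := \exists i\,q'\,t'.\ M(q_0, t) \triangleright^{i} (q', t')$$

$$\mathsf{Halt}_{\mathsf{TM}}\,(n : \mathbb{N}, \Sigma, M : \mathsf{TM}^n_\Sigma, t : \mathsf{tp}^n_\Sigma) := \mathsf{Halt}_{\mathsf{TM}^n_\Sigma}\,(M, t)$$

Figure 2 Definitions for Turing machines.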


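Dually, we introduce termination. $M : \mathsf{TM}^n_\Sigma(L)$ terminates in $T \subseteq \mathsf{tp}^n_\Sigma \times \mathbb{N}$ if

$$M \downarrow T := \forall t\,i.\ T\ t\ i \to \exists q\,t'.\ M(q_0, t) \triangleright^{i} (q, t')$$

We call a machine total if $\exists c.\ M \downarrow \lambda t\,i.\ i \ge c$, i.e. if it halts on any tape in at most c steps.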


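> Fact 7. The introduced predicates are (anti-)monotonic:
1. If $M \vDash R'$ and $\forall t\,\ell\,t'.\ R'\ t\ (\ell, t') \to R\ t\ (\ell, t')$, then $M \vDash R$.
2. If $M \downarrow T'$ and $\forall t\,i.\ T\ t\ i \to T'\ t\ i$, then $M \downarrow T$.

We will use the following total machines we call primitive machines:

$$\begin{aligned}
\mathsf{Read} : \mathsf{TM}^1_\Sigma(\mathbb{O}\,\Sigma) &\vDash \lambda t\ (\ell, t').\ \ell = \mathsf{curr}\ t[0] \land t' = t & \mathsf{Write}\ s : \mathsf{TM}^1_\Sigma(\mathbb{1}) &\vDash \lambda t\ (\_, t').\ t'[0] = \mathsf{wr}\ s\ t[0] \\
\mathsf{Move}\ d : \mathsf{TM}^1_\Sigma(\mathbb{1}) &\vDash \lambda t\ (\_, t').\ t'[0] = \mathsf{mv}\ d\ t[0] & \mathsf{Return}\ \ell : \mathsf{TM}^1_\Sigma(L) &\vDash \lambda t\ (\ell', t').\ t' = t \land \ell' = \ell
\end{aligned}$$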


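The last necessary tool is a set of combinators to compose machines. Given $M : \mathsf{TM}_\Sigma^n(L)$ 
and $M'_\ell : \mathsf{TM}_\Sigma^n(L')$ (for $\ell : L$), we introduce the combinator $\mathsf{Switch}\ M\ M' : \mathsf{TM}_\Sigma^n(L')$, which 
executes $M$ and, depending on the label $\ell$ returned by $M$, executes $M'_\ell$. 

$$\frac{M \vDash R \qquad \forall (\ell : L).\ M'_\ell \vDash R'_\ell}{\mathsf{Switch}\ M\ M' \vDash \lambda t_0\, (\ell', t').\ \exists t\, (\ell : L).\ R\, t_0\, (\ell, t) \land R'_\ell\, t\, (\ell', t')}$$

$$\frac{M \downarrow T \qquad M \vDash R \qquad \forall (\ell : L).\ M'_\ell \downarrow T'_\ell}{\mathsf{Switch}\ M\ M' \downarrow \lambda t\, k.\ \exists k_1\, k_2.\ T\, t\, k_1 \land 1 + k_1 + k_2 \le k \land \forall \ell\, t'.\ R\, t\, (\ell, t') \to T'_\ell\, t'\, k_2}$$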


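Given a machine $M : \mathsf{TM}_\Sigma^n(\mathbb{O}(L))$, we introduce the combinator $\mathsf{While}\ M : \mathsf{TM}_\Sigma^n(L)$, 
which loops $M$ until the result label is $\mathsf{Some}\ \ell$. The realisation relation for While is defined 
inductively, whereas the termination relation is the co-inductively defined accessibility relation 
(the rule for $\mathsf{WhileT}$ below is to be read coinductively). 

$$\frac{M \vDash R}{\mathsf{While}\ M \vDash \mathsf{WhileR}\ R} \qquad \frac{R\, t\, (\mathsf{Some}\ \ell, t')}{\mathsf{WhileR}\ R\, t\, (\ell, t')} \qquad \frac{R\, t\, (\mathsf{None}, t') \qquad \mathsf{WhileR}\ R\, t'\, (\ell, t'')}{\mathsf{WhileR}\ R\, t\, (\ell, t'')}$$

$$\frac{M \downarrow T \qquad M \vDash R}{\mathsf{While}\ M \downarrow \mathsf{WhileT}\ R\ T}$$

$$\frac{T\, t\, k_1 \qquad \forall t'\, \ell.\ R\, t\, (\mathsf{Some}\ \ell, t') \to k_1 \le k \qquad \forall t'.\ R\, t\, (\mathsf{None}, t') \to \exists k_2.\ \mathsf{WhileT}\ R\ T\, t'\, k_2 \land 1 + k_1 + k_2 \le k}{\mathsf{WhileT}\ R\ T\, t\, k}$$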
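To illustrate how the step bounds of Switch and While compose, here is a minimal Python sketch. It is an abstraction made for this example: a labelled machine is modelled as a function from tapes to a triple (label, tapes, steps), an optional label is `None` or a 1-tuple `(l,)`, and the names `switch` and `while_` are hypothetical.

```python
def switch(M, M_next):
    """Run M, then the machine M_next(l) selected by M's label l.
    Step count follows the bound 1 + k1 + k2 of the Switch rule."""
    def run(t):
        l, t1, k1 = M(t)
        l2, t2, k2 = M_next(l)(t1)
        return l2, t2, 1 + k1 + k2
    return run

def while_(M):
    """Repeat M until it returns a label (l,); a None label continues.
    Each continuing iteration costs 1 + k1, the last one costs k1."""
    def run(t):
        total = 0
        while True:
            l, t, k1 = M(t)
            if l is not None:
                return l[0], t, total + k1
            total += 1 + k1
    return run

# Example machine on one "tape" holding a counter: decrement until zero.
def dec(t):
    if t[0] == 0:
        return ("done",), t, 1
    return None, (t[0] - 1,), 1
```

For instance, `while_(dec)((3,))` performs three continuing iterations and one final iteration, so the accumulated bound is 3 · (1 + 1) + 1 = 7 steps.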


Composing machines with the operator Switch only works for machines over the same 
alphabet and the same number of tapes. To remedy this situation we introduce lifting 
operations for alphabets and tapes, and also a relabelling operation Relabel. 

Given a retraction $f : \Sigma \hookrightarrow \Gamma$, a default symbol $d : \Sigma$, and a machine $M : \mathsf{TM}_\Sigma^n$, the 
alphabet lift $\Uparrow_{(f,d)} M : \mathsf{TM}_\Gamma^n$ translates every read symbol via $f^{-1}$, passes it to $M$, and 
translates the symbol $M$ writes via $f$. In case $f^{-1}$ returns None, $d$ is passed to $M$. 

Given a retraction $I : \mathsf{Fin}_m \hookrightarrow \mathsf{Fin}_n$ and a machine $M : \mathsf{TM}_\Sigma^m$, the tape lift $\Uparrow_I M : \mathsf{TM}_\Sigma^n$ 
replicates the behaviour of $M$ on tape $i$ on tape $I_i$, and leaves all other tapes untouched. 

We only show the canonical realisation and termination relations for the tape lift here: 


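$$\Uparrow_I R := \lambda t\, (\ell, t').\ R\ (\mathsf{select}\ I\ t)\ (\ell, \mathsf{select}\ I\ t') \land \forall (i : \mathsf{Fin}_n).\ i \notin I \to t'[i] = t[i]$$

$$\Uparrow_I T := \lambda t\, k.\ T\ (\mathsf{select}\ I\ t)\ k \qquad \text{where } \mathsf{select}\ I\ t := \mathsf{map}\ (\lambda i.\ t[I(i)])\ [0, \dots, m-1]$$

$$\frac{M : \mathsf{TM}_\Sigma^m(L) \vDash R \qquad I : \mathsf{Fin}_m \hookrightarrow \mathsf{Fin}_n}{\Uparrow_I M : \mathsf{TM}_\Sigma^n(L) \vDash\ \Uparrow_I R} \qquad \frac{M \downarrow T \qquad I : \mathsf{Fin}_m \hookrightarrow \mathsf{Fin}_n}{\Uparrow_I M \downarrow\ \Uparrow_I T}$$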
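The effect of the tape lift can be sketched in Python as follows. This is an illustrative model only: a machine is a function from a list of tapes to a pair (label, tapes), the injection I is given as a list of pairwise distinct indices, and `select` and `tape_lift` are names chosen for this sketch.

```python
def select(I, t):
    """Pick the tapes t[I[0]], ..., t[I[m-1]] in that order."""
    return [t[i] for i in I]

def tape_lift(I, M):
    """Lift an m-tape machine M to n tapes along the index list I.
    Assumes the entries of I are pairwise distinct (I is injective)."""
    def lifted(t):
        label, out = M(select(I, t))
        result = list(t)              # tapes outside I stay untouched
        for j, i in enumerate(I):
            result[i] = out[j]        # tape j of M lives on tape I[j]
        return label, result
    return lifted

# Example: a 2-tape machine that swaps its tapes, run on tapes 2 and 0
# of a 3-tape configuration.
swap = lambda t: ("ok", [t[1], t[0]])
lifted_swap = tape_lift([2, 0], swap)
```

Here `lifted_swap(["a", "b", "c"])` exchanges the first and third tape and leaves the second one unchanged, matching the second conjunct of the lifted realisation relation.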

Given a function $r : L_1 \to L_2$ and a machine $M : \mathsf{TM}_\Sigma^n(L_1)$, $\mathsf{Relabel}\ M\ r : \mathsf{TM}_\Sigma^n(L_2)$ 
behaves like $M$, but returns label $r\,\ell$ where $M$ returned $\ell$. 

The last important layer of abstraction introduces the treatment of tapes as registers, 
based on the notion $t[i] \simeq v$ expressing that tape $t[i]$ contains an encoded value $v : V$. A 
type $V$ is a TM-encodable type on alphabet $\Sigma$ if there is a (designated) injective function 
$\varepsilon : V \to \mathbb{L}(\Sigma)$. We define such designated encoding functions for several data types, e.g. 
booleans, tuples, and lists of encodable types. In Coq, the implementation of encodable 
types relies on type classes, such that users can define their own encoding functions. 

We then define tape containment $t \simeq_f v$, where $V$ is TM-encodable on alphabet $\Sigma$, $v : V$, 
$t : \mathsf{tp}_{\Gamma^+}$, and $f : \Sigma \hookrightarrow \Gamma$. 

$$\Gamma^+ ::= \mathsf{START} \mid \mathsf{STOP} \mid \mathsf{UNKNOWN} \mid (s : \Gamma)$$

$$t \simeq_f v := \exists ls.\ t = \mathsf{midtp}\ ls\ \mathsf{START}\ (\mathsf{map}\ f\ (\varepsilon\, v) \mathbin{+\!\!+} [\mathsf{STOP}])$$


Note that the position of the head is fixed in the definition of tape containment: The head 
must be located on the start symbol. By extending the alphabet $\Sigma$ with the delimiting 
symbols START and STOP, values can effectively be copied from one tape to another tape. 
The symbol UNKNOWN is used as the canonical default symbol for the alphabet lift. Let 
$M : \mathsf{TM}_{\Sigma^+}^n$ and $f : \Sigma \hookrightarrow \Gamma$. Then $\Uparrow_{f^+} M : \mathsf{TM}_{\Gamma^+}^n$ (with the canonically inferrable injection 
$f^+ : \Sigma^+ \hookrightarrow \Gamma^+$) is an alphabet lift of $M$. In case the lifted machine reads a symbol that is 
not in the image of $f$, $\Uparrow_{f^+} M$ behaves as if $M$ read UNKNOWN. However, this will by 
design not happen if the head is under a symbol of the encoding of a value. 

Void tapes (written $\mathsf{isVoid}\ t$) do not contain values. The head of the tape is located at 
the right-most symbol: 

$$\mathsf{isVoid}\ t := \exists m\, ls.\ t = \mathsf{midtp}\ ls\ m\ []$$

A void tape can be initialised with a value by writing the encoding of a value delimited 
by START and STOP. 
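The two tape predicates can be sketched in Python as follows. This is a simplified model made for illustration: a tape is either `None` (any tape that is not a midtp) or a triple `(ls, a, rs)`, the retraction f defaults to the identity, and `encode_bools` is a hypothetical injective encoding of boolean lists, not the encoding used in the Coq development.

```python
START, STOP, UNKNOWN = "START", "STOP", "UNKNOWN"

def encode_bools(v):
    """Hypothetical encoding of a list of booleans as tape symbols."""
    return ["T" if b else "F" for b in v]

def contains(tape, v, f=lambda s: s, enc=encode_bools):
    """Tape containment: head on START, then the encoding, then STOP.
    The symbols to the left of the head are arbitrary."""
    if tape is None:
        return False
    ls, a, rs = tape
    return a == START and list(rs) == [f(s) for s in enc(v)] + [STOP]

def is_void(tape):
    """A void tape has its head on the right-most symbol."""
    if tape is None:
        return False
    ls, a, rs = tape
    return len(rs) == 0
```

Because the head position is part of the predicate, a tape holding the right symbols with the head elsewhere does not count as containing the value.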
5 Simulating L on Turing machines 


To simulate L on Turing machines, we use the stack machine semantics, meaning we have to 
implement the relation $\leadsto$ from Section 3.2 as a multi-tape Turing machine $\mathsf{Step} : \mathsf{TM}_\Sigma^n$. The 
central components of Step are machines implementing the heap lookup operation $H[a, n]$ and 
the parsing operation $\varphi$. We omit concrete implementations, but show the correctness 
and termination relations and briefly discuss the proof goals for such verifications. The 
machines share an alphabet $\Sigma$ consisting of 30 symbols, which allows us to encode commands, 
programs, addresses, closures, task and value stacks, heap entries, and heaps. 

The relations we display here are simplified in comparison to the actual relations in Coq 
w.r.t. two aspects: First, we omit the retractions $f_X : \Sigma_X \hookrightarrow \Sigma$ when writing $\simeq$: as long 
as concrete $f_X$ are fixed for every type $X$ encodable on type $\Sigma_X$, their concrete definitions 
do not matter. Secondly, we omit the condition $\mathsf{isVoid}\ t$ both in the premise and conclusion 
of rules: Any unspecified tape is always implicitly void. 


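> Fact 8. There is a machine $\mathsf{Lookup} : \mathsf{TM}_\Sigma^n(\mathbb{B})$ and a $c : \mathbb{N}$ s.t. 
1. $\mathsf{Lookup} \vDash \lambda t\, (\ell, t').\ \forall H\, a\, n.\ t[0] \simeq H \to t[1] \simeq a \to t[2] \simeq n \to$ 
$\text{if } \ell \text{ then } \exists g.\ H[a, n] = \mathsf{Some}\ g \land t'[0] \simeq H \land t'[3] \simeq g \text{ else } H[a, n] = \mathsf{None}$ 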


2. $\mathsf{Lookup} \downarrow \lambda t\, i.\ \exists H\, a\, n.\ t[0] \simeq H \land t[1] \simeq a \land t[2] \simeq n \land i \ge c \cdot (n+1) \cdot (|H| + \max\ |H|\ |a|)$ 


Proof. The machine Lookup can be defined by using building blocks like While and Switch. 
Once the machine is defined, an inductive relation $R$ s.t. $\mathsf{Lookup} \vDash R$ can be automatically 
inferred from the relations of the building blocks. 
By Fact 7 it then suffices to prove $R\, t\, (\ell, t') \to \forall H\, a\, n.\ t[0] \simeq H \to t[1] \simeq a \to t[2] \simeq n \to$ 
$\text{if } \ell \text{ then } \exists g.\ H[a, n] = \mathsf{Some}\ g \land t'[0] \simeq H \land t'[3] \simeq g \text{ else } H[a, n] = \mathsf{None}$ by induction on $R$. 
The termination proof is dual. ∎ 


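> Fact 9. There is a machine $\mathsf{Parse} : \mathsf{TM}_\Sigma^n(\mathbb{B})$ and a $c : \mathbb{N}$ s.t. 
1. $\mathsf{Parse} \vDash \lambda t\, (\ell, t').\ \forall P.\ t[0] \simeq P \to$ 
$\text{if } \ell \text{ then } \exists Q\, P'.\ \varphi\, P = (Q, P') \land t'[0] \simeq P' \land t'[1] \simeq Q \text{ else } \varphi\, P = \mathsf{None}$ 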
2. $\mathsf{Parse} \downarrow \lambda t\, i.\ \exists P.\ t[0] \simeq P \land i \ge c \cdot |P|^2$ 


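> Fact 10. There is a machine $\mathsf{Step} : \mathsf{TM}_\Sigma^n(\mathbb{B})$ and a $c : \mathbb{N}$ s.t. 
1. $\mathsf{Step} \vDash \lambda t\, (\ell, t').\ \forall T\, V\, H.\ t[0] \simeq T \to t[1] \simeq V \to t[2] \simeq H \to$ 
$\text{if } \ell \text{ then } \exists T'\, V'\, H'.\ (T, V, H) \leadsto (T', V', H') \land t'[0] \simeq T' \land t'[1] \simeq V' \land t'[2] \simeq H'$ 
$\text{else } (\neg \exists \sigma.\ (T, V, H) \leadsto \sigma) \land T = [] \to t'[0] \simeq T \land t'[1] \simeq V \land t'[2] \simeq H$ 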
2. Step E Ati. t[0] > T Atl] ~VAt2] ~ HAi>1+c- if T is (a,P)::_ then 
Ja] + JE} + [V1 + IPI C + [E] + max a |H aaar + |PI) else 0 


This suffices to prove one direction of the time invariance thesis w.r.t. simulation: 


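> Theorem 11. There is $M_{\mathsf{sim}} : \mathsf{TM}_\Sigma^n$ and a polynomial $p$ s.t. for closed terms $s$: 
1. If $s$ evaluates to $v$ in $i$ steps, then there exist $t$, $H$, $P$, and $a$ s.t. $M_{\mathsf{sim}}([(\gamma s, 0)], \mathsf{niltp}, \dots, \mathsf{niltp}) \triangleright^{p\, i\, |s|} t$ with 
$t[1] \simeq []$, $t[2] \simeq (P, a)$, $t[3] \simeq H$, and $\mathsf{unf}\ (P, a) = v$. 
2. If $M_{\mathsf{sim}}([(\gamma s, 0)], \mathsf{niltp}, \dots, \mathsf{niltp})$ terminates, so does $s$. 


Proof. Define $M_{\mathsf{sim}} := \mathsf{While}\ \mathsf{Step}$ and $p\, i\, m := c \cdot (3i + 2)^2 \cdot (3i + 1 + m) \cdot m$ for some $c$. ∎ 

> Corollary 12. $\mathsf{Halt}_{\mathsf{L}} \preceq \mathsf{Halt}_{\mathsf{TM}}$ 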


A first version of the Coq verification of $M_{\mathsf{sim}}$ was also discussed in Wuttke’s bachelor’s 
thesis [28]. 
6 Simulating Turing machines in L 


To simulate Turing machines in L, we first give an alternative, executable semantics for 
Turing machines based on iteration of a step-function. We then recap the central ingredients 
of the certifying extraction framework [6], which we use to extract the step-function to L. 
And lastly, we implement a (potentially non-terminating) iteration combinator in L. 

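We define a function $\mathsf{nxt}_M : Q_M \times \mathsf{tp}_\Sigma^n \to Q_M \times \mathsf{tp}_\Sigma^n + \mathsf{tp}_\Sigma^n$ and a polymorphic function 
$\mathsf{loop} : (X \to X + Y) \to X \to \mathbb{N} \to \mathbb{O}(Y)$ as follows: 

$$\mathsf{nxt}_M\ (q, t) := \text{if } \mathsf{halt}\ q \text{ then } t \text{ else let } (q', a) := \delta_M(q, \mathsf{curr}\ t) \text{ in } (q', \mathsf{map}_2\ (\lambda (c, m)\ t.\ \mathsf{mv}\ m\ (\mathsf{wr}\ c\ t))\ a\ t)$$

$$\begin{aligned}
\mathsf{loop}\ f\ x\ 0 &:= \mathsf{None} \\
\mathsf{loop}\ f\ x\ (\mathsf{S}\, n) &:= \mathsf{loop}\ f\ x'\ n && (\text{if } f\, x = \mathsf{inl}\ x') \\
\mathsf{loop}\ f\ x\ (\mathsf{S}\, n) &:= \mathsf{Some}\ y && (\text{if } f\, x = \mathsf{inr}\ y)
\end{aligned}$$

> Fact 13. $\mathsf{loop}\ \mathsf{nxt}_M\ (q_0, t)\ (\mathsf{S}\, i) = \mathsf{Some}\ t' \leftrightarrow \exists q'.\ M(q_0, t) \triangleright^{i} (q', t')$ 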
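The fuel-bounded iteration performed by loop can be sketched in Python. This is a minimal sketch under stated assumptions: the sum type X + Y is modelled by tagged pairs `("inl", x)` and `("inr", y)`, the result None is Python's `None` (so results themselves are assumed not to be `None`), and `countdown` is a hypothetical step function used only as an example.

```python
def loop(f, x, n):
    """Iterate the step function f starting from x with fuel n.
    Returns the result y once f yields ("inr", y), or None when the
    fuel runs out first."""
    for _ in range(n):
        tag, val = f(x)
        if tag == "inr":
            return val        # a result was found
        x = val               # tag == "inl": continue with the new state
    return None               # out of fuel

def countdown(x):
    """Example step function: halts with "done" once x reaches 0."""
    return ("inr", "done") if x == 0 else ("inl", x - 1)
```

Starting from 3, `countdown` needs three continuing steps plus one final call, so fuel 4 succeeds and fuel 3 does not. This mirrors the offset between the fuel S i and the step count i in Fact 13.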


The certifying extraction framework [6] automatically extracts L-terms $s_f$ for functions 
$f$ and proves that $s_f$ computes $f$. Additionally, one can pass a time complexity function 
$\tau_f$ and is then presented with proving certain recurrence equations for $\tau_f$. Since we do not 
implement higher-order functions, we can give a simplified account of the framework here. 

Central in the framework are Scott encodings, which are used to encode elements of 
arbitrary, first-order types as L-terms. The idea behind Scott encodings is that case analysis 
is by application: For instance, if $b$ then $a_1$ else $a_2$ corresponds to the L-term $\overline{b}\ s_{a_1}\ s_{a_2}$, 
where $s_{a_1}$ and $s_{a_2}$ compute $a_1$ and $a_2$ respectively. 
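The idea that case analysis is application can be tried out with Python lambdas. The sketch below is illustrative only: it uses the common convention that a Scott-encoded value takes one argument per constructor (for natural numbers: first the zero case, then the successor case), which is an assumption about the encoding and not taken from the framework's source.

```python
# Scott-encoded booleans: a boolean selects one of two branches.
true = lambda a: lambda b: a
false = lambda a: lambda b: b

def ite(b, a1, a2):
    """'if b then a1 else a2' is just the application b a1 a2."""
    return b(a1)(a2)

def scott_nat(n):
    """Scott numeral: 0 := λz s. z  and  S n := λz s. s (numeral of n)."""
    if n == 0:
        return lambda z: lambda s: z
    pred = scott_nat(n - 1)
    return lambda z: lambda s: s(pred)

def to_int(sn):
    """Decode a Scott numeral by repeated case analysis."""
    count = 0
    while True:
        # zero case yields None, successor case yields the predecessor
        result = sn(None)(lambda p: ("succ", p))
        if result is None:
            return count
        count += 1
        sn = result[1]
```

Unlike Church numerals, a Scott numeral exposes its predecessor in one application, which is what makes pattern matching on extracted data cheap.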

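In general, for types $A_1, \dots, A_n, B$ with a Scott encoding, a function $f : A_1 \to \dots \to A_n \to B$ 
is computed by a term $s_f$ with time complexity function $\tau_f : A_1 \to \dots \to A_n \to \mathbb{N}$ if 

$$\forall a_1, \dots, a_n.\ \exists k \le \tau_f\, a_1 \dots a_n.\ s_f\ \overline{a_1} \dots \overline{a_n} \succ^{k} \overline{f\, a_1 \dots a_n}$$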


The framework also supports higher-order functions and currying, but we omit those 
features complicating e.g. the definition of time complexity since we do not rely on them. 

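The certifying extraction framework comes with a library of computability proofs including 
time complexity, covering natural numbers, lists, and vectors. Furthermore, one can give a 
general computability proof for functions with listable domain, i.e. if there is $[x_1, \dots, x_n] : \mathbb{L}(X)$ 
s.t. $\forall x : X.\ x \in [x_1, \dots, x_n]$ and $f : X \to Y$, then there is $s_f$ and a constant $c$ s.t. 
$\forall x.\ \exists i \le c \cdot n.\ s_f\ \overline{x} \succ^{i} \overline{f\, x}$. 
Since $\delta_M$ has a listable domain, we can use the certifying extraction framework to extract 
$\mathsf{nxt}_M$ for every $M$: 


> Fact 14. Let $M : \mathsf{TM}_\Sigma^n$. There is $s_{\mathsf{nxt}_M} : \mathsf{tm}_{\mathsf{L}}$ and $C_{\mathsf{nxt}} : \mathbb{N}$ s.t. $s_{\mathsf{nxt}_M}\ \overline{(q, t)} \succ^{\le C_{\mathsf{nxt}}} \overline{\mathsf{nxt}_M\ (q, t)}$. 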


Proof. The framework generates the proof obligations $52 \cdot |Q_M|^2 + 56 \le C_{\mathsf{nxt}}$ and $52 \cdot |Q_M|^2 + 130 \cdot n + 216 \le C_{\mathsf{nxt}}$, 
where $|Q_M|$ is the number of states of $M$. If $C_{\mathsf{nxt}}$ is picked large enough 
before running extraction, the obligations can be discharged automatically by the tactic 
solverec provided by the framework. ∎ 


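We now define a term $s_{\mathsf{loop}}$ which expects $f$ and $x$ and loops $f$ until a value $y$ is found, 
or indefinitely if not. Since the extraction framework only covers total functions, we have to 
manually implement $s_{\mathsf{loop}}$. To do so, we rely on a recursion combinator $\rho$ [12], also employed 
by the framework to use recursion: 

> Fact 15. If $s$ and $s'$ are closed abstractions, $\rho\, s\, s' \succ^{3} s\, (\rho\, s)\, s'$. 

We then define $s_{\mathsf{loop}}$ with time complexity function $\tau_{\mathsf{loop}} : (X \to \mathbb{N}) \to X \to \mathbb{N} \to \mathbb{N}$: 

$$s_{\mathsf{loop}} := \rho\,(\lambda r\, f\, x.\ f\, x\ (\lambda x'\, z.\ r\, f\, x')\ (\lambda y\, z.\ y)\ (\lambda z.\ z))$$

$$\tau_{\mathsf{loop}}\ \tau_f\ x\ 0 := 0 \qquad \tau_{\mathsf{loop}}\ \tau_f\ x\ (\mathsf{S}\, i) := \tau_f\, x + 11 + \text{if } f\, x \text{ is } \mathsf{inl}\ x' \text{ then } \tau_{\mathsf{loop}}\ \tau_f\ x'\ i \text{ else } 0$$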


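> Lemma 16. Let $f$ be computable by $s_f$ with time complexity function $\tau_f$, i.e. $\forall x.\ \exists i \le \tau_f\, x.\ s_f\ \overline{x} \succ^{i} \overline{f\, x}$. Then 

1. If $\mathsf{loop}\ f\ x\ i = \mathsf{Some}\ y$, then $\exists k \le \tau_{\mathsf{loop}}\ \tau_f\ x\ i.\ s_{\mathsf{loop}}\ s_f\ \overline{x} \succ^{k} \overline{y}$. 
2. If $s_{\mathsf{loop}}\ s_f\ \overline{x}$ terminates, there exist $i$ and $y$ s.t. $\mathsf{loop}\ f\ x\ i = \mathsf{Some}\ y$. 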
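The time bound for the loop term follows a simple recurrence: each unfolding costs the time of one call to f plus a constant of 11 steps, and the recursion continues only while f returns a left injection. A minimal Python sketch of this recurrence, assuming sums are modelled as tagged pairs `("inl", x)` / `("inr", y)` and using a hypothetical `countdown` step function with constant cost:

```python
def tau_loop(tau_f, f, x, i):
    """Time bound for looping f from x with fuel i:
    0 for fuel 0; otherwise tau_f(x) + 11, plus the bound for the
    remaining iterations if f(x) is a left injection."""
    if i == 0:
        return 0
    tag, val = f(x)
    rest = tau_loop(tau_f, f, val, i - 1) if tag == "inl" else 0
    return tau_f(x) + 11 + rest

def countdown(x):
    """Example step function: halts once x reaches 0."""
    return ("inr", "done") if x == 0 else ("inl", x - 1)

# Assume every call to countdown costs at most 5 reduction steps.
constant_cost = lambda x: 5
```

With these assumptions, looping from 3 with fuel 4 makes four calls, each accounted for with 5 + 11 steps, so the bound is 64. The bound is linear in the number of iterations whenever the cost of f is bounded by a constant, which is the situation for the extracted step function of a fixed machine.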


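This suffices to prove the converse direction of the time invariance thesis w.r.t. simulation: 


> Theorem 17. There is $s_{\mathsf{sim}} : \mathsf{tm}_{\mathsf{L}}$ s.t. for all $M : \mathsf{TM}_\Sigma^n$ there is $C : \mathbb{N}$ s.t. for all $t : \mathsf{tp}_\Sigma^n$: 
1. If $M(q_0, t) \triangleright^{i} (q, t')$, then $\exists j \le C \cdot i + C.\ s_{\mathsf{sim}}\ s_{\mathsf{nxt}_M}\ \overline{(q_0, t)} \succ^{j} \overline{t'}$. 
2. If $s_{\mathsf{sim}}\ s_{\mathsf{nxt}_M}\ \overline{(q_0, t)}$ evaluates to some $v$, then $\exists q\, t'.\ M(q_0, t) \triangleright (q, t')$. 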


7 TM-computable relations over LB are L-computable 


Let $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$ be computable by $M : \mathsf{TM}_\Sigma^n$. We define an L-term computing $R$ by 
taking $l_1, \dots, l_k$ as input, converting them to their respective TM-encoding, and then running 
$M$ with the help of $s_{\mathsf{sim}}$. Step-by-step, $s$ has to: 

1. expect input in the form $s\ \overline{l_1} \dots \overline{l_k}$, 
2. for every $1 \le i \le k$ compute $\overline{\mathsf{midtp}\ []\ \mathsf{bl}\ l_i}$, i.e. the L-encoding of the TM-encoding of $l_i$, 
3. run the simulation $s_{\mathsf{sim}}\ s_{\mathsf{nxt}_M}\ \overline{[\mathsf{niltp}, t_1, \dots, t_k, \mathsf{niltp}, \dots, \mathsf{niltp}]}$, 
4. this computation will (if it terminates) terminate with a value $\overline{(\mathsf{midtp}\ []\ \mathsf{bl}\ l, t'_2, \dots, t'_n)}$, 
meaning $s$ has to output $\overline{l}$. 

Three challenges arise: the term $s$ has to be defined parametrically in $k$, the L-encoding of 
the lists $l_1, \dots, l_k$ has to be converted to the L-encoding of $[\mathsf{niltp}, t_1, \dots, t_k, \mathsf{niltp}, \dots, \mathsf{niltp}]$, and 
the L-encoding of a result $t'$ has to be analysed, and the TM-encoding of a list $l$ contained 
in $t'[0]$ has to be converted to the L-encoding of $l$. 

For the first task, we implement k-ary substitutions and combinators. 


> Fact 18. One can define functions s% :tmzwhere s:tm_,n:N,u:tmi and 
Ap : tmp > tm app, : tm_ > tm; — tm. Varsz : tm# 


such that the following hold: 
1. varss, = k :: varsp, 


2. (app;,5($1,---,8k))a = aPPx (Su) (1u ++» (r)a) 
3. if all elements of u are closed abstractions, then app, (Axs)u =" s9. 


The second and third tasks can again be done by extraction. 


> Fact 19. There is a closed abstraction Sprep $-t. Sprep (l1, . - . , lẹ J>fniltp, t1,..., tp, niltp,..., niltp], 
where ti := midtp [] bl [77,..., Zn] for li = [v1,..., Enl. 

> Fact 20. There is a closed abstraction Sunencry, $-t. if t[0] = midtp [| bl [77,...,%n] and 
l = [r1,..., Un] we have Sunencryt > Somel and Sunencryt > None otherwise. 


> Theorem 21. If R C (LB)* x LB is TM-computable by a machine M with time complexity 
relation T there are a term sm and a constant c s.t. sy computes R with time complexity 


relation (M, Nn1,... Nk) © CM- N1: NE TÍM, Ni,- Nk) +e. 


Proof. Define sm := Àk-Sunencru(Ssim Snxtyr (Sprep(Sconsk(. - - (Scons 01))))) (Ax.x) Į. < 
8 Hoare logic verification framework for Turing machines 


Although the relational verification approach is quite powerful, it suffers from two crucial 
problems. First, realisation and termination are inherently separated proof goals, even for 
total machines, which form the majority of machines we are interested in. The premises of 
the correctness and termination relations are usually almost the same, thus manipulation of 
premises is duplicated. Secondly, in an interactive proof the proof context can grow very 
large. When two machines with say 9 tapes are sequentially composed, the proof context 
contains 9 assumptions on the initial tapes, 9 assumptions for the intermediate tapes between 
the two machines, and 9 final propositions on the tapes. Many of these assumptions are 
equalities. The Turing machine verification framework has naive tactics to simplify such 
proof goals by substitution and rewriting, which become quadratically slower the longer a 
realisation proof gets. 

We propose a verification framework based on Hoare logic, solving both problems. 
Correctness and termination can be proved at once, eliminating the need for repeated 
manipulation of premises. Furthermore, at each step in the verification of an n tape TM, 
the user sees exactly n specifications S for the n tapes, plus optionally a custom invariant 
depending on both the tapes and the label. 

We here give a high-level overview. More details are in the separate Appendix B [8]. 

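The Hoare logic is built as a new layer of abstraction for the relational framework. Given 
$M : \mathsf{TM}_\Sigma^n(L)$, a predicate $P \subseteq \mathsf{tp}_\Sigma^n$ and a relation $Q \subseteq L \times \mathsf{tp}_\Sigma^n$, we write weak Hoare triples: 

$$\vDash \{P\}\ M\ \{Q\} := M \vDash (\lambda t\, (\ell, t').\ P\, t \to Q\, (\ell, t'))$$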


To state that the machine is functionally correct and terminates in a certain time, we use 
total Hoare triples. In addition to the relation of the weak Hoare triple, a total Hoare triple 
asserts that the machine terminates in (at most) i steps if the precondition is satisfied. Thus, 
we avoid spelling out the precondition again. 


$$\vDash \{P\}^{i}\ M\ \{Q\} := (\vDash \{P\}\ M\ \{Q\} \ \land\ M \downarrow \lambda t\, j.\ P\, t \land i \le j)$$
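The difference between weak and total triples can be illustrated by a brute-force check over a finite set of test tapes. The Python sketch below is a testing aid made up for this illustration, not the Coq definition: a machine is modelled as a terminating function from tapes to a triple (label, tapes, steps), and the names `weak_triple`, `total_triple`, and `inc` are hypothetical.

```python
def weak_triple(P, M, Q, tapes):
    """Weak triple on the given tapes: whenever P holds for the input,
    Q holds for the label and output tapes."""
    return all(Q(*M(t)[:2]) for t in tapes if P(t))

def total_triple(P, i, M, Q, tapes):
    """Total triple: the weak triple holds and, whenever P holds,
    the machine stops within i steps."""
    return weak_triple(P, M, Q, tapes) and all(
        M(t)[2] <= i for t in tapes if P(t)
    )

# Example: a "machine" that increments a number in exactly 2 steps.
inc = lambda t: ("ok", t + 1, 2)
nonneg = lambda t: t >= 0                 # precondition
positive = lambda label, t: t > 0         # postcondition
```

The precondition is stated once and is reused for both the functional part and the running time part, which is the duplication the total triple is meant to remove.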


Similar to the old verification framework, every building block like While and Switch 
comes with an associated Hoare triple, shown in Figure 1 in the separate Appendix B [8]. 
Using these rules, an interactive verification is akin to a symbolic execution of machines with 
explicitly annotated invariants. 

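The Hoare triples of user-defined machines $M : \mathsf{TM}_\Sigma^n$ exclusively have 
pre- and postconditions of the form $S_1 \land \dots \land S_n \land I$, where the $S_i$ for $1 \le i \le n$ are either $t[i] \simeq x$ 
for some $x$, $\mathsf{isVoid}\ t[i]$, or $t[i] = t_0$ for some fixed $t_0$, and $I$ is a custom, user-chosen invariant. 
Thus, the following specification for a binary machine 

$$\begin{aligned}
M &\vDash \lambda t\, (\ell, t').\ \forall (x : X)\, (y : Y).\ P_X\, x \to P_Y\, y \to t[0] \simeq x \to t[1] \simeq y \to t'[0] \simeq f(x, y) \land t'[1] \simeq g(x, y) \\
M &\downarrow \lambda t\, i.\ \exists (x : X)\, (y : Y).\ P_X\, x \land P_Y\, y \land t[0] \simeq x \land t[1] \simeq y \land i \ge \tau\, x\, y
\end{aligned}$$

can be compactly restated using the following triple: 

$$\forall x\, y.\ P_X\, x \to P_Y\, y \to\ \vDash \{\lambda t.\ t[0] \simeq x \land t[1] \simeq y\}^{\tau\, x\, y}\ M\ \{\lambda (\ell, t').\ t'[0] \simeq f(x, y) \land t'[1] \simeq g(x, y)\}$$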


A typical workflow for the verification of total machines then looks as follows: First, the 
user defines Hoare triples for their machines, following the shape S1 A--- A Sn A I. Secondly, 
they perform the correctness proof. Thirdly, they define the time complexity function of the 
machine, and add the running time to the triple in the proved lemma. Fourthly, they replay 
the proof script and in the very last step of the verification, show that the accumulated 
running time is indeed bounded by the time complexity function. 
9 L-computable relations over LB are TM-computable 


To convert a term $s$ computing a relation $R$, we follow the same strategy as in Section 7. As an 
example, we show the specification of the machine $M_{\mathsf{unenc}}$ corresponding to $s_{\mathsf{unencTM}}$: 


> Fact 22. There is a machine Mynenct : TMS; and a constant c s.t. for alll: LB we have 
E {P}°l Munenct {Qi} where P, := At. t[0] ~ TA t[1] = niltp A isVoid [2] A isVoid t[3] and 
Qi := A(_,t’). t[1] = midtp [] bI 7 A isVoid t[2] A isVoid t[3]. 


> Theorem 23. If $R \subseteq (\mathbb{L}\mathbb{B})^k \times \mathbb{L}\mathbb{B}$ is L-computable by a closed term $s$ with time complexity 
relation $\tau_R$, there is a polynomial $p$ and a Turing machine $M_s$ s.t. $M_s$ computes $R$ with time 
complexity relation $(m, n_1, \dots, n_k) \mapsto p(m, n_1, \dots, n_k, \tau_R(m, n_1, \dots, n_k))$. 


Proof. The machine $M_s$ is defined parametric in $s$ and by recursion on $k$, i.e. we define a 
different machine for each $k$. First, $M_s$ reads its input and writes the task stack $[(\gamma(s\ \overline{l_1} \dots \overline{l_k}), 0)]$ 
to an auxiliary tape. As a total subroutine, we verify this part using the Hoare framework. 
Then $M_s$ runs $M_{\mathsf{sim}}$. If the computation terminates, the resulting heap will contain a term $\overline{l}$ 
for a list $l : \mathbb{L}\mathbb{B}$. Thus, $M_s$ runs a machine implementing $\mathsf{unf}$ and lastly runs $M_{\mathsf{unenc}}$. 

The size of the heap after $i$ steps is in $O(i \cdot (i + N))$, where $N := n_1 + \dots + n_k$. For 
one reduction step, $N$ variables might have to be looked up, resulting in a runtime of 
$O(i \cdot (i + N + 1) \cdot (N + 1))$ per step. Unfolding a heap takes $O((m + 1) \cdot (H + N + 1))$, where 
$H$ is the size of the heap. In total, we can thus define the polynomial $p$ cubic in the number 
of steps $i$, quadratic in $N$ and linear in the size of the output $m$ as follows, where $c$ is a 
constant: 

$$(m, n_1, \dots, n_k, i) \mapsto c \cdot (i + 1) \cdot (i + n_1 + \dots + n_k + 1) \cdot ((n_1 + \dots + n_k + 1) \cdot (i + 1) + m). \qquad ∎$$
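To get a feeling for the growth of this bound, it can be evaluated numerically. The sketch below reads the bound as c · (i + 1) · (i + N + 1) · ((N + 1) · (i + 1) + m) with N = n_1 + ... + n_k, which is an interpretation of the formula above; the constant c is not given in the text, so it is a parameter, and the function name is hypothetical.

```python
def overhead_bound(c, m, ns, i):
    """Evaluate c * (i+1) * (i+N+1) * ((N+1)*(i+1) + m), N = sum(ns).

    c  : constant factor (unknown in the text, supplied by the caller)
    m  : size of the output
    ns : sizes n_1, ..., n_k of the inputs
    i  : number of reduction steps of the L-term
    """
    N = sum(ns)
    return c * (i + 1) * (i + N + 1) * ((N + 1) * (i + 1) + m)

# Growth in the number of steps for fixed inputs: the ratio of
# consecutive doublings approaches 8, i.e. the bound is cubic in i.
ratios = [
    overhead_bound(1, 0, [4], 2 * i) / overhead_bound(1, 0, [4], i)
    for i in (10**3, 10**4, 10**5)
]
```

Under this reading the bound is cubic in i, quadratic in N, and linear in m, in line with the description in the proof.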


10 Discussion 


We have presented the first mechanised proof of an instance of the time invariance thesis, 
connecting L with Turing machines with a polynomial overhead in time. We prove two 
variants, respectively concerned with simulation of one model of computation on the other, 
and with the computability of relations on boolean strings. In total, including dependencies, 
our development consists of 30.000 LoC and takes about 18 minutes to compile on an Intel 
Core i7-6600U CPU @ 2.60GHz machine. The novel code for this paper still constitutes 7800 
LoC, with about 40% specification and 60% proofs. 

It is folklore that the two variants are equivalent. For instance, in [1,7,23] more emphasis 
is put on the simulation variant. In the interactive theorem proving community it is however 
well-known that “folklore” is not equivalent to “easy to mechanise”. In the setting of 
computability and complexity theory, where one has to deal with models of computation, 
this fact is amplified: proofs on paper virtually never provide concrete implementations in a 
model of computation, but focus on the high-level invariants of a proof. This is based on the 
(again folklore) fact that algorithms can be implemented in models of computation provided 
enough time and strength to sustain the tedium to do so. However, the proof engineering 
time needed to show the correctness of an algorithm (as e.g. a Coq function) and to show 
the correctness of its implementation (as e.g. a Turing machine) are completely independent. 
Correctness proofs of algorithms depend on the intricacy of invariants. Correctness proofs of 
implementations depend on the length of the code of a Turing machine, and the size of the 
gap between specification language and Turing machines. If the specification is a first-order, 
tail-recursive Coq function, the length of the Turing machine is the main factor. 

We however hope that future mechanised proofs of results in complexity theory and of 
the time invariance thesis for other models of computation can profit from our development. 
For us, simulating Turing machines on L and simulating L on Turing machines was the only 
possible way. Now that L is shown reasonable for time complexity, future proofs of the time 
invariance thesis can choose which reasonable model to use for each direction of the proof. 
For the time invariance thesis for a model of computability C, one can show that C can 
simulate Turing machines, which are structurally simple and where computation is local. In 
the other direction, one can show that C can be simulated in L, which has rich structure and 
supports non-local computation on almost arbitrary (first-order) data-structures. 

For our concrete cases of Turing machines and L, powerful abstraction layers and verification 
frameworks are necessary for mechanisation. We would assess that a “manual” 
approach to any invariance thesis involving Turing machines is completely infeasible. The 
Turing machine verification framework [9] and the certifying extraction framework [6] proved 
very valuable in this regard. While having similar goals, they use different mathematical 
approaches to both verification and time complexity analysis. 

First, the certifying L-extraction framework gives support to automatically prove the 
computability of a large subset of Coq functions. For time complexity, a user has to give a 
time complexity function as input, and the framework automatically generates equations the 
function has to fulfil and furthermore provides tactics to solve these equations. Finding a 
time complexity function can prove challenging, but an interactive approach where a wrong 
function is picked and a correct function is reverse-engineered from the equations works 
well. A priori, the framework does not support partial terms. The Lsimpl tactic used in the 
framework can, however, be used to normalise terms in manual verification. 

Secondly, the Turing machine verification framework provides tools to verify the correctness, 
time and even space complexity of Turing machines, but the user has to implement 
those machines manually. For the implementation, the user can write Turing machines in the 
style of a register based while-language, for which a canonical realisation and termination 
relation are inferred automatically. Correctness and time complexity proofs w.r.t. user-defined 
relations are now simply inclusion proofs between the canonical and the user-defined relations. 
Due to the split of correctness and termination, the framework works well for the verification 
of partial machines. However, for total machines, termination and correctness are distinct 
proof goals and have to be proved separately, which leads to a mathematical duplication of 
similar proof goals, sometimes even to actual proof code duplication. 

The novel Hoare logic verification framework we present remedies this situation: For total 
machines, only one proof subsuming both correctness and time complexity has to be carried 
out, while it still supports separate proof goals for partial machines. No canonical relations 
are used. The verification of Hoare triples is carried out directly on the implementation of 
the machine, using the proof rules for the used combinators and user-defined machines. We 
conjecture that the Hoare logic framework scales further and possibly far. Our simulation 
of L on Turing machines is reasonable on time, but if terms exhibit pointer explosion [7], 
the space overhead might not be a constant factor. The proof of this stronger time and space 
invariance thesis for L [7] is considerably more complicated, but might now be in reach. 

However, the assessment that “Turing machines as model of computation are inherently 
infeasible for the formalisation of any computability or complexity theoretic result” [9] still 
stands. Future developments of mechanised results in complexity theory can and should be 
based on L or similarly well-suited models. Our certifying extraction framework provides 
valuable help in doing so, but there are interesting extensions worth exploring in the future. 

First, this framework does not cover space complexity, which would however be needed 
for a proof of the full invariance thesis for L. Secondly, exploring an automatic generation 
of time-complexity functions, where the user can then provide a (polynomial) upper bound 
for the function in a second step, might be interesting. Lastly, the framework requires user 
input for every single recursive function used, also for auxiliary functions, and the automatic 
verification of these is based on Ltac tactics which sometimes fail, and sometimes are slow. 
Here, an automatic generation of correctness proofs based on a meta-programming tool for 
Coq like MetaCoq [24,25] might be a possible solution. 
A Definition of Turing machines 


We compare the definition of Turing machines we use [9] to the one by Hopcroft, Motwani, 
and Ullman [13]. 


1. The logical system in [13] is classical set theory, whereas we work in constructive type 
theory. We discuss the impact of this foundational choice below. 

2. The alphabet in [13] is separated into a set of input symbols $\Sigma$ and a superset of tape 
symbols $\Gamma$, where we unify both into a single type. 

3. The blank symbol in [13] is an explicit part of Turing machines as an element of $\Gamma$, but 
not of $\Sigma$. We do not specify blank symbols explicitly and instead leave it to a user to 
specify a blank symbol or even various blank symbols. 

4. Tapes in [13] are not formally defined. It is only stated that tapes ‘extend infinitely to 
the left and right, initially hold[ing] [...] the input’. Instantaneous descriptions of Turing 
machines are formally defined as strings over $\Gamma$ and $Q$, which contain ‘the portion of the 
tape between the leftmost and the rightmost non-blank, unless the head is to the left of 
the leftmost non-blank or to the right of the rightmost non-blank’. In the latter case, the 
blanks between the head and the non-blank content are part of the string. 

5. The transition function in [13] is a partial function, whereas ours is total. If the transition 
function is unspecified, the computation of the machine halts, whereas we have an explicit 
boolean halting function. In Coq’s type theory one requires classical logic and AUCy.n to 
compile Turing machines with a partial transition function into an equivalent definition 
with total transition functions. In general, any compilation of partial functions on finite 
types to total functions is non-computable and thus not definable in Coq’s type theory 
without axioms. 

6. A machine in [13] always writes a symbol and always moves to the left or right. We allow 
to not write a symbol and to not move the head. The first is important regarding our 
definition of tapes (otherwise a fully empty niltp can never stay fully empty), whereas the 
second is a relatively arbitrary choice to allow more freedom in the definition of concrete 
machines. 

7. Turing machines have an explicit set of accepting states in [13]. We do not add one, 
because our definition is not aimed at formalising computability theory directly. Instead, 
our more flexible definition of labels subsumes the binary notion of accepting states, but 
allows for more interesting constructions like the Switch and MemWhile machines. 


Subtle difficulties might arise when defining notions of computability theory in classical 
set theory. For instance, defining Turing machines with a transition function which takes as 
input the whole tape is problematic: Nothing permits non-computable transition functions 
then, making arbitrary problems decidable by encoding the decision into the transition 
function. When imposing the transition function to be computable one obtains a circular 
dependency: The notion of computability is needed to define transition functions, transition 
functions are needed to define Turing machines, but Turing machines are needed to define 
the notion of computability. The well-known solution here is to define Turing machines as 
finite objects, i.e. let the transition function, although partial, work on a finite domain and 
codomain. In classical set theory one can then always show afterwards that this transition 
function is computable, since every function with finite domain and codomain is. In our 
type-theoretic setting, we can also show that the transition function is computable, provided 
it is, as we defined it, a total function. We did so in Section 6. 


